INFORMATION BULLETIN NO. 7 
SELECTION OF APPLICANTS
TO A FACULTY OF EDUCATION:
A CASE STUDY


W.S. PERUNIAK 
M.W. WAHLSTROM 
E.L. WEINSTEIN 


MAY, 1978 


COMMISSION ON DECLINING SCHOOL ENROLMENTS IN ONTARIO (CODE) 


R.W.B. JACKSON, COMMISSIONER
HOWARD B. HENDERSON, EXECUTIVE SECRETARY
252 Bloor Street West 
Toronto, Ontario 
M5S 1V6 


SELECTION OF APPLICANTS 
TO A FACULTY OF EDUCATION: 
A CASE STUDY 


W.S. Peruniak 
M.W. Wahlstrom 
E.L. Weinstein 


This paper was commissioned by and prepared for the Commission 
on Declining School Enrolments in Ontario and is not to be cited 
or quoted without the permission of the Commission and the authors. 


This study reflects the views of the authors and not necessarily
those of the Commission or the Ministry of Education.


TABLE OF CONTENTS


I     INTRODUCTION
II    HISTORICAL CONTEXT
III   REVIEW OF SELECTION ISSUES
IV    GENESIS OF THE RATING INSTRUMENT
V     INTERVIEW PROCEDURES
VI    ANALYSIS OF THE DATA
VII   RESULTS AND CONCLUSIONS

REFERENCES

APPENDICES




CHAPTER I 


INTRODUCTION 


The main purpose of this study was to examine the data file of the
Faculty of Education, Queen's University, on the 840 applicants selected
to be interviewed for admission to the B.Ed. program, a process that
resulted in a 1971-72 class of 640. A follow-up of the 640 students was conducted
to determine how many had not received certification, who had accepted 
teaching positions and were currently employed by an Ontario school 
board in 1977-78, and which candidates were not employed as teachers in 
Ontario. Because an extended and a structured methodological procedure 
was used in selecting the 1971-72 class, based upon previous admissions 
procedures of Queen's and information from the research literature, it 
seemed appropriate to use this particular class for follow-up purposes 
to determine the relationship between data obtained in the admissions 
process and information about their current role in the Ontario educa- 
tional system. 


Data tapes from the Ministry of Education provided basic informa- 
tion on who was teaching during the school year 1976-77, as up-dated in 
files effective June 1977. When this study began it was not possible to
obtain current data for the 1977-78 school term from Ministry of Education
computer files on the place of employment of a given teacher. The June
1977 information yielded a list of 557 certified teachers who had graduated
from the 1971-72 class, of whom 346 were currently employed, and was used
to determine place of employment. Information about the remaining 83
students was provided by officials at Queen's University. The majority of 
the 83 students were not residents of either Ontario or other provinces 
and had not been included in this particular Ministry of Education computer 
file. Principals from 255 schools were contacted via telephone to provide 
information about the 346 teachers. This was done according to a structured 
interview schedule that provided data that was related to information obtained 
during the 1971 admissions interview and the 1971-72 teaching experience of 
the Queen's B.Ed. program. 


Information in this report deals largely with the Faculty of 
Education, Queen's University, experience in structuring an admissions 
procedure based upon their experience and information from the research 
literature. Some six years later, by means of this study, we examined 
the validity of the admissions measures and are assembling research data 
on the teachers who have remained in the teaching force over the past 
six years, as described by their respective principals. We acknowledge 
that this is a global assessment and may have personality characteristics 
involved in the descriptions, but this is a first step and, indeed, a 
needed one as supervisory officials usually are charged with personnel 
selection and assessment. Only now, both in the research literature and 
in our schools, is a focussed concern developing on how to assess teachers 
and teaching effectiveness. That is not to say there is a dearth of 
material on the topic; indeed, there is an abundance of material but it 
has little apparent utility for school system application. 


In this report we outline, in case study format, the events at Queen's 
related to admissions issues and solutions. As well, Chapter III outlines 
the research literature on admission and selection issues and provides a 
summary of conclusions that are relevant in 1978. Most of the literature 
cited is of relatively high quality and describes issues clearly. A dominant 
problem remains, namely, the so-called criterion problem. Before we can 
gain significantly in the quality of admissions decisions, we must pay 
particular attention to defining the attributes of a "good teacher" in 
terms of those deemed successful or effective as teachers, if this is to 
be a dominant criterion in the admissions phase of the B.Ed. program. 
Although somewhat dated, results from Fishman's (1958) work entitled
"Unsolved Criterion Problems in the Selection of College Students" provide
a basis for further study in the area of admissions decisions.


We anticipate that the results and conclusions presented in the 
report will be but one of the first steps in Ontario for structuring 
further studies to determine the nature of effective teaching and to 
describe "good" characteristics desired for selected candidates by a 
faculty of education. 


CHAPTER II


HISTORICAL CONTEXT 


During the period of educational expansion in the 1960's, the 
major issue in the teacher supply question concerned recruitment. 
Increasingly, during the period of educational contraction in the 
1970's, the problem of teacher selection is supplanting teacher 
recruitment as the central challenge confronting the teaching
profession, government authorities, and universities. Almost every 
teacher training institution in the province has recently experienced 
the difficulty of selecting from a large pool of applicants the number 
allowable under fixed enrolment targets. As the school system continues 
to shrink in the years immediately ahead, faculties of education may 
be expected to reduce the number of places in their training programs 
still further. When this development is combined with a relatively
constant number of university graduates looking for professional 
training, it is safe to forecast that faculties of education will face 
unprecedented difficulties in selecting, on supportable grounds, the 
small cadre of teacher trainees required by the school system. 


The difficulty of meeting that challenge is compounded by the 
dearth of accumulated experience. Expertise acquired during the years 
of recruitment does not directly transfer to the different needs 
associated with rigorous screening. Faculties of education cannot 
simply revert to the historic university mode of selection and base 
their admission criteria on academic standing. Their sense of respon- 
sibility to the public and to the profession would preclude them from 
resorting to such a simplistic solution. Unlike many other professions, 
teaching involves more than a delivery of services; it involves an 
intense reciprocal process with learners that calls for teachers with 
relational and interactive skills as well as intellectual competence. 


The primary purpose of this report is to describe a case study of
a systematic admissions procedure used at Queen's University and,
thereby, to contribute to the pool of shared experience. Although
studies on admissions procedures number in the thousands,
the incremental gain in knowledge from many studies is minimal. This
study systematically builds upon previous experience and is now, in
its sixth year, concerned with an evaluation of the procedures by a
follow-up of selected candidates.


When the Faculty of Education, Queen's University (known then as 
McArthur College of Education) first opened for classes in September, 
1968, the major question regarding admissions was whether enough 
teacher candidates could be attracted to give the new Faculty and the 
new program a measure of credibility. The vast majority of university 
graduates who were interested in becoming secondary school teachers 
were still opting to enter the profession via the emergency two- 
summer-sessions pattern. Ironically, the heavy enrolment in the Queen's 
summer program of teacher preparation (which had functioned for several 
years as a satellite operation of the Ontario College of Education) 
contributed to the low enrolment of the fledgling winter program. Even 
with the inducement of a Bachelor of Education degree (the first such 
degree to be offered in Ontario for pre-service education), Queen's 
attracted only 189 candidates for the B.Ed. degree program in its 
first year. Under these circumstances, the initial concerns regarding 
enrolment focussed more on promotion and recruitment than on selection. 


Almost overnight the situation changed. Word spread that there were 
novel features to the Queen's program. Consequently, some 1,300 
university graduates applied for admission to the second year of the 
Faculty's operation. The enrolment target for that year, 1969-70, 
however, had been fixed at 200, with little possibility for any 
appreciable expansion until additional student residences promised by 
the province were built. Thus it happened that during the very first 
year of its existence the Faculty was confronted by the inescapable 
necessity for selective admissions. 


It took several months to grasp the full dimensions of the problem. 
Then, early in 1969, the Dean appointed an advisory committee on
admissions to formulate a set of guidelines for selection. (Under the
interim Faculty board structure there was yet no standing committee on
admissions.) This advisory committee established academic grade
thresholds in the various curriculum options and recommended that those
applicants who met the requirements (generally a high B average) be 
interviewed by panels consisting of two or three instructors. By means 
of that academic eligibility filter, the Faculty succeeded in reducing 
the pool of 1,300 applicants to about 500 who were invited for interview. 


An assessment form was devised to focus interviewers' attention 
on particular characteristics; namely, enthusiasm for subject, clarity 
of conversation, vitality, commitment, openness, and maturity. A final 
summation section forced interviewers to assign a global rating on a 
10-point scale (later reduced to six points) ranging from not suitable 
to outstanding. Applicants were then rank-ordered on the basis of 
interview scores within their respective curriculum areas. An actual 
enrolment of 220 was generated for the academic year, 1969-70. 


Staff reactions to this first experience with selection procedures 
were predictably mixed. Nevertheless, the difficulty involved in 
finding a consensus for alternative procedures led to an extension of 
the existing arrangements for another year. During the spring and 
summer of 1970, McArthur staff (with student participation this time)
again conducted personal interviews at a number of Ontario centres.
Of a total of 1,081 applicants, over 500 met the academic criteria and
were eligible for interviewing. The number actually interviewed amounted
to 494; of those 494 interviewees, only 22 were not offered acceptances
(giving a rejection rate of under 4.5%). After some offers of acceptance
were declined, the net enrolment for the third academic year, 1970-71,
was 337.


It was during that second year's experience with the selection
procedures that faculty concerns began to crystallize. Some staff
members criticized the practice of according such a high weighting to
academic grades. Others were disturbed by various aspects of the
interviewing arrangements: the lack of a research base for the rating
instrument, the subjectivity involved in the evaluation, and the total
absence of training for the interviewers. By the spring of 1970 there
was a prevalent feeling that the faculty should make the admissions
policy "one of our serious research projects in the College", and,
subsequently, the new Admissions and Standards Committee was charged
with recommending a new admissions policy before January 31, 1971.
Concurrent with the emergence of faculty doubts about admissions
procedures, advice from the influential advisory committee (not to
be confused with the early admissions committee) strongly supported
retention of personal interviews.


It seems important to understand why the opinions of concerned 
faculty, knowledgeable members of the profession, and informed lay 
representatives were divided. The most plausible explanation relates 
to the general state of the art in defining and predicting teacher 
effectiveness. Without a general consensus on what constitutes 
exemplary teaching behaviour, it must be difficult even for reasonable 
people to reach agreement on which indicators have the greatest 
predictive utility. That central problem is further compounded by the 
pressures of time and numbers. Faculties of education lack the 
resources to conduct an in-depth study of each applicant. They require 
a technology or a set of procedures sufficiently parsimonious to assess 
thousands of applicants within a few months and to determine which ones 
are the best prospects for teaching. Given the diverse perceptions of 
the ideal teacher practitioner and given the indeterminacy of most 
empirical studies in that area, it is not surprising that the Queen's 
Faculty of Education exhibited strong differences of opinion every time 
the issue of teacher selection was debated. Rather, it is to their 
credit that the faculty have chosen (1) to state their differences openly, 
(2) to recognize that the mixture of concern and uncertainty frequently
signals a significant question, and (3) to mobilize energy and goodwill
for a systematic probe of the question. 


At the same period a seemingly unrelated event occurred which was 
to provide the foundation for the subsequent investigation. On
October 26, 1970, the Department of Educational Administration,
The Ontario Institute for Studies in Education, sponsored a one-day
"Seminar on Teacher Selection Simulation Materials", featuring
Professor Dale Bolton of the University of Washington. Although Bolton's
approach was intended for use with certified teachers, the Queen's 
Coordinator of the B.Ed. program, W.S. Peruniak, the senior author, 
envisaged that the methodology could be adapted to the selection 
purposes of the faculty of education. Furthermore, it was reassuring 
to note that Bolton's work was grounded on the extensive study of 
teacher characteristics conducted by Ryans (1960). After examining 
Bolton's materials and reading the background research, the Coordinator 
concluded that Ryans' and Bolton's work represented an unusually 
powerful approach to personnel selection, being based on a logical 
progression from rigorous empirical research to theoretical formu- 
lation and then to applied practice. 




CHAPTER III


REVIEW OF SELECTION ISSUES 


An examination of research literature on selection soon leads 
one to consider classification and placement, which are types of 
decisions about individuals made by institutions. 


Cronbach (1971) has argued that selection is a type of classifi- 
cation problem where one of the classes of assignment is rejection. 
According to Cronbach and Gleser (1965), classification is the broad 
term to include a variety of people-assignment problems. It includes 
the assignment of people to different programs when there are several 
different predictor variables and several different programs. The 
assignment of people to different programs along a single dimension 
is a more restricted procedure called placement. 


The concern in this report is with the selection problem where 
decisions are either to reject or accept. Typically, in faculties 
of education, students have considerable freedom to choose their own 
set of courses within a given program rather than yielding to a 
placement procedure. If additional admissions requirements are im- 
posed beyond university level regulations for individual programs, 
the prerequisites tend to be minimal and associated with specific 
knowledge and/or skills. 


Arguments about the need and type of selection procedures are 
numerous and divergent. Each faculty of education in Ontario has a 
selection procedure based largely upon academic achievement in the 
institution previously attended. Institutional resources and the 
number of applicants largely dictate the extent of selection and 
admissions procedures. In most cases in Ontario, selection procedures 
are designed to select a quota of students from a group of applicants 
who generally meet the minimum requirements. The dominant criterion
for selection of candidates appears to be prediction of success in the
faculty of education academic program rather than the criterion of 


remedial courses to supplement existing programs. When making
admission decisions, an institution is faced with restricting itself
to certain kinds of applicants where the number of applicants far
exceeds the desired quota. If the number of applicants is relatively
small in relation to the number desired for a particular program, the
decision may be to recruit applicants. In recent years, applicants
have far exceeded the number that can be admitted to programs. Given
this situation, decisions are then required on what measures are desired
for use in making admissions decisions, how these measures will be
combined and weighted if more than one is used, and where the decision
point(s) will be on whatever measure or measures are used. The single
score, be it from one variable or a combination of variables, is then
used to decide whether an applicant will be admitted or rejected.
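
The three decisions just described (which measures, how they are weighted, and where the decision point falls) can be illustrated with a minimal sketch. It assumes two standardized measures, a hypothetical 60/40 weighting, and a decision point fixed by a quota; none of these values comes from the Queen's procedure, and the function names are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical weights: the report does not prescribe any weighting scheme.
WEIGHTS: Dict[str, float] = {"grade_average": 0.6, "interview_rating": 0.4}


def composite_score(measures: Dict[str, float],
                    weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine standardized measures into a single weighted score."""
    return sum(weight * measures[name] for name, weight in weights.items())


def admit_by_quota(applicants: Dict[str, Dict[str, float]],
                   quota: int) -> Tuple[List[str], float]:
    """Rank applicants by composite score and admit the top `quota`.

    Returns the admitted applicant ids (best first) and the implied
    decision point, i.e. the lowest composite score that was admitted.
    """
    ranked = sorted(
        ((composite_score(m), applicant) for applicant, m in applicants.items()),
        reverse=True,
    )
    admitted = ranked[:quota]
    decision_point = admitted[-1][0] if admitted else float("inf")
    return [applicant for _, applicant in admitted], decision_point


if __name__ == "__main__":
    # Illustrative standardized scores, not data from the study.
    pool = {
        "A": {"grade_average": 1.2, "interview_rating": 0.5},
        "B": {"grade_average": 0.3, "interview_rating": 1.5},
        "C": {"grade_average": -0.5, "interview_rating": 0.2},
        "D": {"grade_average": 0.8, "interview_rating": -1.0},
    }
    print(admit_by_quota(pool, quota=2))
```

In this sketch the decision point is a by-product of the quota rather than a standard set in advance, which is the situation described above for most Ontario faculties.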


Many policy decisions regarding the purpose of programs are 
assessed to determine who should be selected. Most academic insti- 
tutions appear to operationalize their policy to select those candidates 
they predict will survive their programs of instruction and ultimately 
graduate. In some cases, institutions behave as though they believe 
that even brief exposure to university programs is beneficial to 
individual students, even if there is a high probability of failure. 
Where institutional resources are abundant, some appear to believe 
that all persons should be given an opportunity to attend the desired 
program even if success is remote. 


Concerns expressed in the above statements are largely with the 
predicted probability of success. During the seventies, however, 
admissions policy has often reflected a matter of economics rather than 
education, as budgets reflected income that was directly proportional 
to the size of the student body. When economics were not a primary 
concern, the issue became the perceived "quality" of the graduate which 
would be reflected upon the institution as the students entered the 
world of work. In this situation, the deliberate intention is to change 
students into models of enlightenment, where the benefit to the 
institution is in the growth of the selected candidate. Whatever the
determination, in order for the institution to select intelligently,
it must take a stand concerning the kind of benefit it wants to maximize
on the average in its institutional selection decisions. Unfortunately, 
however, many institutions have not made deliberate and explicit decisions 
about this basic issue. 


If the institution feels that its programs are fixed and determined 
by the graduate from a one-, four-, or five-year program, it will not 
offer remedial courses. Thus, staff would feel that it is unreasonable
to attempt to accommodate itself to applicants who cannot be expected to 
accomplish the objective within the specified period of time. Such 
an institution would be unsympathetic to the idea of introducing remedial 
courses for students who were inadequately prepared, or introducing more 
courses in a sequence, resulting in a longer curriculum for those who 
could reach the objective but only through more time and effort than 
the optimum. The situation in this special case could relate, especially 
in Ontario, to those persons who appear to be academically competent 
but who have an inadequate grasp of the English language. Thus, one 
way to adapt the program to potential students would be to offer special 
courses in English as a second language. 


The sources of applicants may pose a special problem. On the one 
hand, one could leave the matter to chance and consider whoever applies. 
However, faculties of education in Ontario are supported by public funds 
and may feel an obligation or pressure to serve, primarily, those 
students who come from their geographical region. In the case of Ontario, 
there is no apparent mandate or obligation to students of one religion 
or sex, although it is of some considerable interest to have equal 
opportunity for females. The Ontario Teacher Education Colleges also 
feel a pressure to ensure that a desired proportion of selected 
candidates are affiliated with the Roman Catholic Church, as they will 
staff the schools designated as separate schools or Roman Catholic 
schools. Beyond these considerations, there is also the issue of whether 
the institution will actively recruit candidates. From the point of 
view of decision theory, even when available selection tests are not 
very efficient, utility can be increased through improved recruiting 


1] 


for a constant-sized student body. This permits use of a more stringent 
criterion of quality for the students who will be admitted (Cronbach 

and Gleser, 1965), and this lower selection ratio has the same effect 

on increasing average utility that a higher selection-procedure 

validity would produce. 
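
The trade-off between selection ratio and validity can be made concrete with a short sketch. It rests on a standard textbook assumption that is not stated in this report: predictor and criterion are bivariate normal, and applicants are admitted strictly from the top down. Under that assumption the mean standardized criterion score of the admitted group equals the validity coefficient multiplied by the normal ordinate at the cut, divided by the selection ratio. The function name and the figures used are illustrative only.

```python
from statistics import NormalDist


def mean_criterion_of_selected(validity: float, selection_ratio: float) -> float:
    """Expected standardized criterion score of the admitted group.

    Assumes a bivariate normal predictor and criterion, with the top
    `selection_ratio` fraction of applicants admitted on the predictor.
    """
    if not 0.0 < selection_ratio < 1.0:
        raise ValueError("selection_ratio must lie strictly between 0 and 1")
    standard_normal = NormalDist()
    # Predictor cut score (in z units) that admits the desired fraction.
    cut = standard_normal.inv_cdf(1.0 - selection_ratio)
    return validity * standard_normal.pdf(cut) / selection_ratio


if __name__ == "__main__":
    # A weaker procedure applied to a larger pool (lower selection ratio) ...
    weak_but_selective = mean_criterion_of_selected(validity=0.3, selection_ratio=0.2)
    # ... compared with a stronger procedure applied to a smaller pool.
    strong_but_lenient = mean_criterion_of_selected(validity=0.5, selection_ratio=0.5)
    print(round(weak_but_selective, 3), round(strong_but_lenient, 3))
```

With these illustrative figures, admitting one applicant in five with a validity of 0.3 yields a slightly higher expected criterion score than admitting one in two with a validity of 0.5, which is the sense in which recruiting can substitute for a better selection instrument.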


The Faculty of Education, Queen's University may also use its 
public image as a selection device. In a setting where the reputation 
is high, admissions policy and selection machinery provide the direct 
selection, but the public relations office provides indirect selection 
by controlling the image of the institution so that students select 
themselves in ways desired by the faculty of education. In one sense, 
direct selection can be dispensed with if the indirect procedure 
results in recruiting the kind of students the faculty wants. There 
are some problems, however, with indirect selection. Firstly, the 
faculty cannot tell very much about its selection ratio. Secondly, 
images are inherently more resistant to change than are institutions. 
They often lag far behind the reality of the campus. An admissions 
procedure that does not depend heavily on self-selection is much more 
controllable. 


Although vast experience rests with a registrar's office and many 
research studies have examined the measures used by Faculties of 
Education in making decisions, we seem to have reached a plateau 
regarding significant improvements in the use of various measures. 
However, a rational selection program rests upon the deliberate and 
informed choice of the material to use. A major consideration in 
choosing material for selection purposes is the amount of material to 
be gathered and processed directly, which affects the cost of making 
each selection decision. It is, indeed, an interesting exercise to 
determine when the amount and type of information received have no 
additional impact on selection and, hence, should not be included. 
The extreme to be avoided is when so much testing and information 
gathering is being conducted that the measurement program costs more 
than it is worth. Although one can analyse costs for various
components, it is a policy decision as to how much one should extend
costs in relation to return from the investment. 


After selection of measures is completed, the next stage is to 
determine how the measures will be combined into a form from which 
consistent decisions can be made on applicants. The regression model 
and the cutting-score model are in common use, with individual 
characteristics included by each institution. Each model has different 
implications and produces different results. However, the usual
situation is that such contrasting models will select most of the same
applicants; the two models disagree increasingly as an applicant's score
on either model approaches the decision point of accept vs. reject.
With regard to faculties of education and the relatively small number 
of applicants, many informal procedures are usually part of selection 
decisions. It seems that selection decisions are more frequently made 
by using a combination of a clinical judgment and statistical data. 
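
The contrast between the two models can be sketched as follows. The regression model is treated here as a compensatory weighted composite and the cutting-score model as a set of minimum scores that must all be met; the weights, cut-offs, and applicant scores are hypothetical and are chosen only to show where the models agree and where they part company.

```python
from typing import Dict

Applicant = Dict[str, float]

# Hypothetical values for illustration; none is taken from the report.
REGRESSION_WEIGHTS: Dict[str, float] = {"grades": 0.5, "interview": 0.5}
COMPOSITE_CUT = 70.0
MINIMUM_ON_EACH: Dict[str, float] = {"grades": 65.0, "interview": 65.0}


def regression_model(applicant: Applicant) -> bool:
    """Compensatory: strength on one measure can offset weakness on another."""
    composite = sum(w * applicant[name] for name, w in REGRESSION_WEIGHTS.items())
    return composite >= COMPOSITE_CUT


def cutting_score_model(applicant: Applicant) -> bool:
    """Non-compensatory: the minimum must be met on every measure."""
    return all(applicant[name] >= cut for name, cut in MINIMUM_ON_EACH.items())


APPLICANTS: Dict[str, Applicant] = {
    "strong": {"grades": 85.0, "interview": 80.0},
    "weak": {"grades": 55.0, "interview": 50.0},
    "uneven": {"grades": 90.0, "interview": 55.0},
    "even_borderline": {"grades": 67.0, "interview": 66.0},
}

if __name__ == "__main__":
    for name, scores in APPLICANTS.items():
        print(name, regression_model(scores), cutting_score_model(scores))
```

The clearly strong and clearly weak applicants receive the same decision from both models. The two applicants near the decision point do not: the uneven profile is accepted only by the compensatory composite, and the evenly borderline profile only by the multiple cut-offs.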


Following from the measures and an appropriate model to utilize 
in selection decisions is the concern with the location of the specific 
decision point or points on the admissions measure or measures--the 
point above which a candidate is selected and below which he is rejected. 
Many factors are considered in deciding upon an optimal point. A 
faculty of education may select this point on the basis of the desired 
number of students to graduate from this class. Other considerations 
could be on the basis of a predetermined level of minimal acceptable 
talent, or on the basis of a minimum probability of success that the 
faculty of education can expect from the selected group. Alternatively, 
in the case of the Ontario Teacher Education Colleges, the Minister 
of Education established the quota for the 1978-79 academic year. 
Similar action may result from policy bodies responsible for all 
universities within the province. In terms of educational selection
being a form of placement, the deciding point or score can be the point 
at which rejection is more beneficial to society than acceptance 
would be. 


In some selection models the decision point is very precise,
whereas in others there are essentially three groups: clearly accept,
decision to be made, and clearly reject. Such a "decision to be made"
or decision area can employ additional variables, or random factors may
even be permitted to operate. It is with this group that, in fact,
decisions are made. Because the decision area group is at the cutting
line, it provides an opportunity to involve a special additional set
of measurements in the form of a try-out program, such as being accepted
on probation or for a special trial period. An interesting way to
view acceptance of applicants within the decision-theory frame of
reference is that all acceptances in the admissions process are really
temporary in that admission does not guarantee graduation. The candidate
may fail during term exams and subsequently leave, may be counselled out
by staff, or may decide to leave for personal reasons.
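
The three-group arrangement can be expressed as a simple rule with two thresholds. The sketch below is illustrative only: the threshold values and names are assumptions, and what happens to applicants in the decision area (further measures, probationary acceptance, or chance) is left outside the function, as it is in the discussion above.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "clearly accept"
    DECIDE = "decision to be made"
    REJECT = "clearly reject"


def classify(score: float, lower: float, upper: float) -> Decision:
    """Sort an applicant into one of three groups using two thresholds.

    Scores at or above `upper` are clear acceptances, scores below `lower`
    are clear rejections, and everything in between falls in the decision
    area, where additional information would be brought to bear.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score >= upper:
        return Decision.ACCEPT
    if score < lower:
        return Decision.REJECT
    return Decision.DECIDE


if __name__ == "__main__":
    # Hypothetical thresholds on a 100-point admissions score.
    for score in (82.0, 71.0, 58.0):
        print(score, classify(score, lower=65.0, upper=75.0).value)
```

Setting the two thresholds equal removes the decision area and reduces the rule to the single precise decision point of the first kind of model.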


An overview of the selection process was outlined above with brief 
mention of procedures in Ontario. The issues raised are common across
institutions and differ only in their particular application. One
source of information as to what has been useful is the research 
literature. 


The measurement process entails asking about criteria and 
standards, where standard in this case is reflected in the acceptance- 
rejection decision. For faculty of education programs, one criterion is 
that of a "good teacher". However, given the nature of teaching, little 
agreement can be achieved on determining fine distinctions between 
the quality of teaching of two graduates. Thus, we are left with the 
situation where it is seldom possible to measure the criterion of
interest directly. Faculties of education wish to balance their books 
in the dual responsibility to society and the student and, thus, tend 
to select grades in courses as an indication of success in a program. 
Whether grades obtained at university and ability to teach are 
directly related or not remains an open question having supporters on 
both sides of the issue. 


There are literally thousands of studies of the various kinds of
measures for predicting academic grades. Regardless of whether
academic grades measure personality attributes, effort, and other
personal characteristics, grades have been chosen as the criterion of
success in school and university. Regardless of what one reads or
hears about academic grades, especially regarding the goals of a
faculty of education, institutions behave as though they valued most the
students who obtain high grades. Indeed, grades are readily available,
quantifiable, and are of importance in such decisions as being able to
graduate.


Although average grade is frequently used as a criterion in
selection, it has been criticized on a number of grounds. Klein and
Hart (1968) argue that grades are a poor representation of educational
utility because they are contaminated with irrelevant factors such
as diligence, handwriting, and general verbal ability, as well as
personal attractiveness and skill in interacting with the instructor.
A second argument surrounds the variability of grading and of different
grading systems. Interpretation of grades then becomes difficult across
different institutions, their departments, and the general time in which
the program was completed. In Ontario, for example, there is considerable
concern over the fact that the secondary school grades have steadily
become higher since 1967 when provincial examinations were abolished.
Also, there has been a reduction in the number and type of required
courses needed for graduation purposes in secondary schools and
faculties of education. Hence, the lack of compatibility of programs
taken by students has reduced the comparability of average grade from
person to person and group to group. In recent years, it has
frequently been argued that the spread of grades has generally diminished
and is so reduced that individual differences in average grade have become
unreliable. Secondary school and undergraduate course grades have been
criticized as being unreliable, but the general opinion of researchers
who conducted the Ontario "Interface Study" is that grades remain a
reasonable and stable indicator for the prediction of academic
achievement. As a general conclusion, however, the empirical character
and philosophical bases of grading practices at the secondary school
and university level are implicit and unexamined, so that no one really
knows how grades are being arrived at or how they relate to the goals
of the institution. Although there is much rhetoric about achieving
consistent policy and grading practices, there appears to be little
systematic attempt to relate grades to the goals of faculties of
education, and there do not appear to be consistent grading practices
related to the concept of utility.


Because of perceived difficulties with grading practices and 
expressed dissatisfaction, other individual or group data have 
occasionally been used. One source of additional data has been 
ratings by faculty members. Research with procedures in industry,
civil service, and education has indicated strengths and especially
weaknesses (Campbell and Fiske, 1959; Davis, 1964, 1966; Guilford,
1954; Brown, Weinstein, and Wahlstrom, 1978; and Weinstein, Brown,
and Wahlstrom, 1976). When used in academic situations, interview
ratings tend to be difficult to obtain, of low reliability, and highly 
redundant with grades. Various Ontario institutions such as the Queen's 
Faculty of Education and the Ontario Teacher Education Colleges have, 
in recent years, attempted to overcome some of the difficulties with 
interviews and have embarked upon a course to improve data from this 
source. Although some professional groups use certification by an 
external agency as a further means to distinguish between expected 
performance levels within the profession, the current certification 
procedures in Ontario are generally based upon information about 
graduation from an institution rather than upon an additional examination. 


There has been a trend in education to initiate competency-based 
testing for certification of teachers, but this work is still at the 
experimental stage and has had virtually no influence upon certification 
or graduation requirements in Ontario. Some information is collected 
on application forms about accomplishments in non-academic areas, such as 
prior work experience, but the type of information is often unique to 
an individual and is weighed clinically rather than entered into a statistical
analysis to help determine an applicant's suitability for entering a
teaching pre-service program. In the special case of faculties of 
education, applicants frequently are young and have not had an opportunity 
to gain non-academic work experience. Accordingly, it is suggested that 
they should not be penalized for this lack of opportunity by making this
type of activity a requirement for admission. A further consideration 
could be persistence until graduation. That is, in making admissions 
decisions, there has been concern about no-shows and about selected
candidates dropping out once admitted. To date there is minimal evidence
as to who will be a no-show and who will drop out, but these are
important considerations and are currently being investigated at the 
Queen's Faculty of Education and the Ontario Teacher Education Colleges. 


A review of the research literature provides evidence about 
decisions to operate a fixed-treatment program or an adaptive-treatment 
program. Given that faculties of education tend to operate upon the 
assumption of a fixed-treatment program, and if we further assume a 
linear relationship between pay-off and predictor variables, then 
Cronbach and Gleser (1965) have shown that it is desirable to test 
at least twice as many bona fide applicants as will be accepted, if the 
predictor variables are worth using at all. However, this is not the 
case in an adaptive-treatment program. In practice, and independent of 
the type of program officially offered at an institution, faculty members 
as individuals frequently alter their programs to suit the people they 
find in their classes and, in essence, introduce a form of adaptation. 
The consequence for a faculty of education is that treatment of students 
within programs will have important bearings on the use of predictor 
variables and on the optimum selection ratio to maximize gain in 
utility to be obtained through considering predictors at all. 
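
The dependence of gain on the selection ratio can be illustrated with a
short calculation. The sketch below is not Cronbach and Gleser's own
derivation. It assumes a normally distributed predictor, a linear pay-off,
and strict top-down selection, and the validity of 0.5 is a hypothetical
value chosen only for illustration. Under those assumptions the expected
criterion gain per accepted applicant, in standard deviation units, is the
validity multiplied by the normal ordinate at the cutoff and divided by the
selection ratio.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def gain_per_accepted(validity, selection_ratio):
    """Expected criterion gain (in SD units) per accepted applicant.

    Assumes a normal predictor, a linear pay-off, and top-down selection.
    """
    cutoff = STD_NORMAL.inv_cdf(1.0 - selection_ratio)
    return validity * STD_NORMAL.pdf(cutoff) / selection_ratio


def gain_per_tested(validity, selection_ratio):
    """The same gain spread over every applicant tested, accepted or not."""
    return gain_per_accepted(validity, selection_ratio) * selection_ratio


if __name__ == "__main__":
    r = 0.5  # hypothetical validity coefficient, not a value from this study
    for ratio in (0.9, 0.7, 0.5, 0.3):
        print(ratio,
              round(gain_per_accepted(r, ratio), 3),
              round(gain_per_tested(r, ratio), 3))
```

Under these assumptions the gain per accepted applicant rises as the
selection ratio falls, while the gain per applicant tested is greatest when
half of those tested are accepted.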


A faculty of education's sources of applicants are factors to 
consider in developing admissions programs. One can maximize the 
selection ratio with known information if applicants come from a 
variety of sources, but this procedure may be inefficient if a faculty 
serves a local area and admits nearly all applicants. If the latter 
is the case, funds spent on admissions purposes simply may be wasted. 
If an institution does recruiting, it is important that the nature of 
the institution not be misrepresented or exaggerated as this may 
increase the number of dropouts later. 


A general consideration in using prediction measures is that of
seeking the greatest gain in utility by using the least expensive
measures that yield comparable results. That is, if useful predictions 
can be made from data that are collected for other purposes and, 
therefore, add nothing additional to costs, such data have the potential 
of providing appreciable gain. Thus, one should consider the readily 
available data as predictors. The second recommendation is that 
predictors cannot be considered in isolation from each other. Each 
additional measure has value only to the extent that it adds validity 
beyond that provided by the initial predictor, and each additional 
measure must be evaluated by its ability to improve upon the existing 
team or by the incremental validity (Sechrest, 1963). 
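
The notion of incremental validity can be made concrete for the
two-predictor case. The following is a minimal sketch using the standard
formula for the squared multiple correlation of a criterion with two
predictors. The function names and the correlations in the example are
hypothetical and are not taken from the Queen's data.

```python
def squared_multiple_r(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_validity(r_y1, r_y2, r_12):
    """Gain in explained criterion variance from adding predictor 2 to predictor 1."""
    return squared_multiple_r(r_y1, r_y2, r_12) - r_y1 ** 2


if __name__ == "__main__":
    # Hypothetical case: grades correlate 0.5 with the criterion and an
    # interview rating correlates 0.4 with it.
    print(incremental_validity(0.5, 0.4, 0.6))  # rating largely redundant with grades
    print(incremental_validity(0.5, 0.4, 0.0))  # rating independent of grades
```

In this illustration a rating that correlates 0.6 with grades adds under
two per cent of criterion variance, whereas the same rating, if it were
uncorrelated with grades, would add sixteen per cent.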


A transcript is usually available describing an applicant's 
success in previous programs. This readily available document pro- 
vides a data base for prediction that has had a long history of success. 
Many studies have demonstrated that secondary school grades are good 
predictors and that secondary school achievement is the most valid 
single predictor of performance in university as reflected in the 
criterion of university grades. Such a finding is to be expected 
since secondary school activities are in essence a work sample and are
similar in form to the work expected at universities where the criterion 
to be predicted, namely academic grades, is reflected in previous 
experience. When using measures for prediction in selection decisions, 
one must remember that measures can have a positive correlation 
because of similarity in form as well as similarity in content or
dependence upon common underlying attributes. Accordingly, ratings
might tend to predict ratings better than test scores; test scores should 
be the best predictors of test scores and, by the same argument, grades 
should be the best predictors of grades, as typically they are. Thus,
in establishing an academic selection system, the first kind of a 
predictor to look for is some readily available work sample from previous 
performance. The best work samples usually will be the ones closest to 
the criterion in time and form. If the Queen's Faculty of Education 
admissions committee is concerned with academic success in their program, 
which follows an undergraduate degree, one of the best predictors for 
Queen's surely is the previous academic record. However, the matter is
generally not as simple as this because considerable concern is with 
prediction of success in teaching. Because the predictors are now 
significantly different from the criterion, and what constitutes good 
teaching is ill-defined, the ability to achieve a desired level of 
predictability is less than the institution desires. By way of noting 
that this is an important issue, Queen's Faculty of Education has 
systematically assembled admissions data from the 1971-72 class and has 
followed up on those who remain in teaching to determine the relationship 
between data available during the students' program year and employment 
data after several years of teaching experience. However, when grades 
at university are the criteria for success in a program, one of the 
best predictors will be previous university grades. If there is a
desire to improve on the level of prediction available from the
conveniently accessible work sample, one should evaluate carefully the
increment to validity that is obtained in using additional data. 


When using academic records of performance from a variety of 
institutions, it is expected that different standards will have a 
differential effect upon predictions of university success. Such 
was not the case in using unadjusted secondary school grades and test 
scores as predictors. That there is no demonstrated influence was 
reported by Lindquist (1963) and Linn (1966). The rationale is that 
test scores serve in a regression equation to counteract the influence 
of differential grading standards in secondary schools where, in effect, 
they equate the secondary school grades from various schools. Evidence 
such as this supports a recommendation for introducing a testing 
program for admissions purposes when applicants come from a variety 
of institutions having different procedures and standards for 
assigning grades. Just as secondary school records and aptitude test 
scores make an excellent prediction system (Hills, Gladney, and Klock, 
1967), so should university transcripts and scores from a testing 
program. Whereas the practice of rank-in-class has decreased in 
Ontario schools, rank-in-class and secondary school average serve about 
equally well as predictors, especially when they are used in conjunction 
with other predictors such as admissions test scores (Hills, 1971). The
distinct advantage is that schools do not need to wait for students to
complete a grade or to graduate before a reasonably complete
representation of secondary school performance can be made. A marked difficulty
arises, however, in the case of large secondary schools, as the procedure 
is only optimally effective if all students are ranked. 


In addition to providing a transcript, an applicant generally 
completes an application form requesting personal data. Although the 
data may be used for specific purposes, such as to identify persons with 
specific talents, this source of information for academic selection 
often lies untapped. Opinions and results are mixed regarding the value 
of such data and some methodological problems are often overlooked in 
studies using biographical data. Hilton and Myers (1967) have noted 
that incremental validity of a new predictor is important if it is to 
be of practical significance. As well, any variable must meet the 
conditions imposed by cross-validation where the stability of a variable's 
contribution to the prediction is assessed. In the special case of 
biographical data, where selected items have been determined empirically 
from an initial sample, cross-validation is especially crucial. 


Although cross-validation is important, it places second to the
concern that operational validity may be diminished. Knowledge about tests,
admissions procedures, and prior decisions based upon such information 
may affect the way persons apply and may even affect who is willing and 
interested in completing an application. 


One of the difficulties of using biographical data is that what 
works in one situation may not work in another setting. Thus, it is 
difficult to recommend any specific set of such data that might be 
relevant to a given situation. A further concern is that there is no 
theoretical framework for dealing with the structure and associated 
data base. 

A methodological issue of considerable concern is when the effects 
of prior selection are not included in studies that report satisfactory 
incremental validity. When previous grades and test scores have been 
used in the selection of candidates on whom the validity of biographical
data was examined, unused variables will tend to appear superior in
predictive efficiency due to the restricted range of the variable
already used in selection. Detailed reports by Horst (1954, 1955)
and Lunneborg and Lunneborg (1966) outline absolute and differential
prediction settings utilizing biographical data as predictors. Only 
limited facets of using biographical data have been investigated although 
hundreds of studies have been reported. Studies tend to duplicate the 
work already done and there is a need for extensive methodological 
developments prior to any breakthrough in using such data. 
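
The effect of prior selection described above can be demonstrated with a
small simulation. The sketch below uses wholly artificial data. It assumes
that grades and a biographical score are independent, that the criterion
depends more heavily on grades, and that only the top quarter of applicants
on grades is admitted. The weights, sample size, and seed are hypothetical
choices made for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def validities(group):
    """Return (grade validity, biographical validity) for a list of rows."""
    grades, bio, criterion = zip(*group)
    return pearson(grades, criterion), pearson(bio, criterion)


def simulate(n=20000, keep=0.25, seed=1):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        grade = rng.gauss(0, 1)
        bio = rng.gauss(0, 1)
        # Assumed model: the criterion leans more on grades than on the bio score.
        criterion = 0.6 * grade + 0.4 * bio + rng.gauss(0, 0.7)
        rows.append((grade, bio, criterion))
    # Admit only the top `keep` fraction of applicants, ranked on grades.
    admitted = sorted(rows, key=lambda row: row[0], reverse=True)[:int(n * keep)]
    return validities(rows), validities(admitted)


full_validities, admitted_validities = simulate()

if __name__ == "__main__":
    print("all applicants (grades, bio):", full_validities)
    print("admitted only  (grades, bio):", admitted_validities)
```

Among all applicants grades are the stronger predictor, but within the
admitted group, where the range of grades has been restricted, the unused
biographical score appears to be the better one.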


Use of biographical data as non-intellective predictors appears 
enticing at first glance, if one attempts to relate current
non-intellective factors to a future attainment in this area, because it is an
additive component to the largely academic prediction model. However, 
lack of conclusive evidence for such variables, together with obvious 
methodological flaws in several studies using this type of data base, 
makes one cautious about a positive recommendation for routine use in 
selection decisions. At this point in time, it is also difficult to 
suggest any meaningful path to explore that should lead to success. 


Whereas studies on biographical data for prediction purposes have 
largely been negative, evidence from work on previous academic performance 
and scores on admissions tests has often been found to be useful.
Typically, admissions tests are closely related to success criteria and, 
hence, are deemed to be extremely relevant for admissions purposes. One 
of the most common procedures is to use a test battery containing verbal 
and mathematical aptitude subtests. It is common knowledge that these 
two sets of scores are optimal areas for deriving a measure of 
intellectual competence associated with performance in academic programs. 
Horst (1966) reports that combining previous academic performance and 
these two scores results in a large proportion of criterion variance being
accounted for in prediction equations. Although reported studies frequently 
cite use of either the Scholastic Aptitude Test of the College Entrance 
Board or the American College Testing Program, it is suggested that 
similar incremental validity can be obtained when using any of a wide 
variety of aptitude tests commercially available. If one wishes to add
achievement test scores to those of previous academic performance
and aptitude test scores, one will generally find a modest or
inconsequential gain in multiple prediction models. A partial explanation
is that there is not a clear distinction between aptitude and achieve- 
ment test results of young adults. That is, it is difficult to measure 
a university graduate's intelligence uncontaminated by previous aca- 
demic experience and training. Prediction equations using such data 
are generally adequate for use in immediately succeeding years, but 
one should check the beta weights after two or three years to ensure 
continued operational validity as the relative weights may shift due 
to different emphases in previous academic performance and the cur- 
rent program at a particular university. 


Evidence from the research literature suggests that a sound 
educational selection system should include an evaluation of past 
performance, such as reflected in grades, and, secondly, have added 
to this evidence one or more scores from tests of academic aptitude 
and achievement. Addition of procedures and data beyond those
described has been tried with little consistent evidence of incremental
validity. As noted, biographical data has been used with varying 
degrees of success. A very common procedure that apparently has high 
face validity in faculty of education admissions is to have each appli- 
cant interviewed by a representative of the institution. Evidence on 
the value of interview findings as incremental predictors is very
discouraging and has been for years (Cronbach and Gleser, 1965; Kelly, 1954;
Weinstein, Brown, and Wahlstrom, 1976). It seems that because we have 
so much personal faith in interviews we are reluctant to accept evidence 
to suggest that such procedures are inadequate. Consequently,
considerable effort has been devoted to an analysis of the interview
process and its role in selection decisions. One speculation is that the
so-called unstructured interview is relatively lacking in efficiency when
compared to the structured interview. In essence, the structured
interview is similar to a test and takes on many of the positive attributes
of testing which would suggest that results from a structured interview 
for a selection process would illustrate a significant improvement over
those from unstructured settings. A tentative recommendation is that 
one should provide in-service training for interviewers and that 
interviews be conducted using a structured rating system rather than a 
general discussion approach. General findings from studies of the 
interview process are that (a) decisions that are made in the selection 
interview are usually made within the first 30 seconds of the interview, 
even though an average interview time period is 15 minutes, (b) early 
impressions are very important, (c) the interview tends to be a search 
for negative evidence, where anything unfavourable that appears in the 
interview is likely to lead to rejection, (d) interviews often become 
sales pitches during the latter minutes when the interviewer has already 
decided to recommend acceptance of the applicant, but if rejection is 
the decision, the interviewer is less pleasant, and (e) interviewers 
tend to differ in their response styles, particularly in their category 
widths, i.e., their tolerance in accepting candidates. Because of the 
unstructured nature of many interviews, personal idiosyncrasies of the 
individual are allowed to influence interview ratings and, hence, may be 
a major explanation for the differences between the decisions of 
different interviewers. When considering the final decisions that will 
be made from interviews, it appears that unfavourable impressions are 
much more important than favourable impressions. As in the marking of 
essays, interviewers are sensitive to adaptation level where a poor 
applicant tends to make the applicant who follows look good and, conversely, 
a good applicant handicaps the person following. One measurement perspec- 
tive is that increased structure and greater concern and control over the 
interview content would see a significant improvement in interview 
results. Whereas the focus tends to be upon reliability and content 
validity, the greatest need is to establish construct validity in 
interviews for incremental predictive purposes. Implicit in these 
statements is the question of the most relevant criterion. If one is 
selecting on the basis of academic performance as the criterion, then 
previous academic grade average should be one of the most useful 
indicators. However, if one wishes to use a different criterion, such 
as high quality teaching performance, then an interview may yield "good"
results. It is here argued that the interview can be useful as a 
selection tool, especially if greater attention is paid to psychometric 
considerations. However, at present the contribution of the interview 
is largely beneficial as an in-service program for faculty and to public 
relations. It appears that few negative categories arise with the 
interview apart from time and cost considerations. 


Cronbach and Gleser (1965) have argued that application of wideband 
procedures may justify continued use of interviews. The principle is 
that information of modest validity used in many decisions may be more 
beneficial than information that is highly relevant for but one decision. 
A dominant feature of the interview is that it is a viable procedure 
for obtaining information on a variety of topics and great variety can 
be introduced without increasing the cost proportionately. By means of 
such an argument, one could justify an interview over a long admissions 
test relevant for predicting academic performance. The weak point in 
such a statement is that interview information must be systematically 
disseminated and used if it is to be beneficial. Perhaps more of the 
unanticipated gain from interviews occurs through the in-service
phenomenon than is fully appreciated. From an economic and efficiency
perspective, the interview has low utility for most admissions decisions. 
However, from the writers' perspective, it would be most interesting 
to analyze institutional actual dollar expenditures in situations using 
interviews and not using interviews. Also, one would wish to examine 
the productivity gain within a faculty if their time were devoted to 
other activities. Although such a study may be difficult to conduct and 
it may not be easy to quantify the different sets of outputs, such an 
analysis may reveal greater benefits from interview procedures than is 
normally claimed. 


References and ratings of qualifications or personal characteristics 
are routinely required by many educational institutions and in many ways 
are similar to the interview. There is minimal evidence in the research 
literature on the effectiveness of such data. As such procedures are 
generally of a heterogeneous nature and do not specify the characteristics
being rated, there is no empirical basis for expecting ratings to 
provide appreciable incremental validity. Gough, Hall, and Harris 
(1963) and Smith (1966) report disappointing results in the effective- 
ness of such data for admissions decisions. 


Just as there has been a glimmer of hope associated with increased 
use of biographical data and interviews, much effort has been 
expended upon incorporating personality variables into selection 
procedures. Chauncey and Frederiksen (1951) suggested that the
greatest advances may come through a thorough exploration of the 
measurement of personal qualities. Little has apparently been 
accomplished since that early statement was written. Smith (1966) 
has cautioned us against optimism and it is difficult to see where 
any one of the several lines of effort has moved us forward over the 
past two decades. It seems logical, especially in the domain of teaching, 
that temperament and personality should play an important role in 
academic performance and should indeed influence how a person performs 
as a teacher. However, there is little encouraging evidence to report. 
According to Lavin (1965), there is no sign of the "big breakthrough"
here, despite a great amount of effort being devoted to the problem. 
An explanation offered is that many of the criteria used in selection may 
share elements of personality measurement and, hence, do not contribute 
a significant variance component in prediction equations when used as 
separate scores or variables. 


Different studies incorporate varying terminology, such as the 
use of work samples for prediction purposes. For purposes of a data 
base to be used by faculties of education, the work sample is essentially 
represented in the secondary school and university academic record and is, 
perhaps, the best work sample possible. Persons using academic records 
should remember that three- and four-year university programs, summarized
in a one-page transcript, are massive data bases and should not be 
considered lightly in any decision process. In fairness to admissions 
officers, however, we acknowledge that data other than previous academic 
performance is desired in order to have a broad data base. 


By way of a summary of non-cognitive predictors, it seems that only 
minimal gains in prediction above and beyond what is attainable via the
use of intellective predictors alone have been demonstrated. We continue 
to seek an expanded data base for prediction purposes, and continually 
argue that components other than the academic are relevant, but our 
success in this area has been less than our rhetoric would suggest. 


The review of issues related to the selection problem has been 
largely based upon the work of Hills (1971) and is freely adapted for 
this report. Because the number of studies is literally in the thousands 
and many excellent reviews are available, it was deemed appropriate that 
a modification of existing material would be more valuable than another 
review at this time. Thus, we gratefully acknowledge the extensive use 
made of Hills' reports and suggest that the material presented here is 
more comprehensive given this particular approach. 


The review of materials presented dealt with the type and nature of 
variables usually considered in admissions decisions. Once the variables 
have been chosen, however, the core of the process begins. Now the 
admissions office must decide which students will be accepted and which 
rejected. These selection decisions are usually made by one of three 
common methods: (1) allow an admissions officer to study the applicant's 
record, weight the evidence as clinical judgment dictates, and arrive at 
an admit or reject decision; (2) establish cutting scores on each variable 
as a basis for the admit/reject decisions; or (3) combine two or more 
variables into a single score, usually by a linear combination, and 
admit the applicants with the highest composite score. 


The first alternative, clinical judgment, is probably the most widely
used selection method. It uses human judgment to integrate applicant 
data, as well as to select the variables for consideration. It allows 
consideration of each applicant as an individual case. It avoids the 
time-consuming calculations common to many mathematical decision-making 
techniques. To its detriment, however, it has been repeatedly demonstrated 
that mathematical methods make more accurate decisions than admissions 
personnel. Clinical judgment is also criticized because it does not 
allow decisions to be made openly by explicit and defensible criteria. 
Thus, clinical decisions tend to obscure institutional selection goals
and promote the quality standards of the individual decision maker. 


The second alternative, use of multiple cutting scores, has two 
forms. In its most basic form, the institution sets cutoff scores on 
each variable and then selects those with scores above all cutoffs. Quite 
often cutoff scores are set arbitrarily and yield an unexpected number of 
enrollees. There are several methods, however, for setting cutoff scores 
properly. These methods may take account of the cost of gathering 
information about the applicant, the validity and reliability of infor- 
mation obtained, and the desired enrolment of each program. Cutoff deter- 
mination procedures can also be used in the more complex model. In 
that model, two cutoffs are set for each variable. Those below the first 
cutoff are rejected, those above the second cutoff are accepted, and 
decisions about the remaining group are made by clinical judgment or 
linear combination. This technique allows institutions to quickly 
identify and accept the most desirable students, and permits establish- 
ment of minimal standards for admission. It also helps cut admissions 
costs while preserving or enhancing the quality of decisions made. 
Multiple cutoffs are easy to compute, to administer, and to explain to 
laymen. The separate scores involved provide a valuable basis for 
guidance counselling. The strict cutoff approach assumes that 
compensation between needed skills is not feasible. With the
two-cutoff model, this assumption is tempered by the assumptions of the method
used to deal with the undecided group. It is preferable that the
assumption of noncompensation be moderated, since experience suggests that
in education, and especially in institutions with varied curricula, lack
of one kind of talent can often be compensated for by possession of 
another relevant talent. 
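
The two forms of the cutting-score method can be written out as simple
decision rules. The sketch below is one possible reading of the two-cutoff
model for several variables: an applicant is rejected if any score falls
below its first cutoff, accepted if every score reaches its second cutoff,
and otherwise referred for clinical judgment or linear combination. The
variable names and cutoff values are hypothetical and were not used at
Queen's.

```python
def strict_cutoff(scores, cutoffs):
    """Basic form: accept only if every score reaches its cutoff."""
    passed = all(scores[name] >= cut for name, cut in cutoffs.items())
    return "accept" if passed else "reject"


def two_cutoff(scores, lower, upper):
    """More complex form with a lower and an upper cutoff on each variable."""
    # Any score below its lower cutoff leads to rejection.
    if any(scores[name] < cut for name, cut in lower.items()):
        return "reject"
    # Every score at or above its upper cutoff leads to acceptance.
    if all(scores[name] >= cut for name, cut in upper.items()):
        return "accept"
    # The remaining group is left to clinical judgment or a linear composite.
    return "refer"


# Hypothetical cutoffs for illustration only.
LOWER = {"grade_average": 65, "aptitude_test": 50}
UPPER = {"grade_average": 80, "aptitude_test": 70}

if __name__ == "__main__":
    applicant = {"grade_average": 72, "aptitude_test": 75}
    print(strict_cutoff(applicant, LOWER))
    print(two_cutoff(applicant, LOWER, UPPER))
```

In this reading an applicant with a grade average of 72 and a test score of
75 clears the minimum standards but is not accepted outright, and so joins
the undecided group in which compensation between skills can be considered.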


The third method, linear combination, also has several forms. None- 
theless, one may say that linear combination assumes skill compensation. 
That is, a high aptitude of one relevant kind can compensate for lower 
aptitude in another relevant skill. Linear combination methods, in
general, have relied upon computers to provide highly sophisticated 
multiple regression equations as a basis for a composite score. These 
methods provide exact solutions in a statistical sense, but generally
one finds that the regression weights can be simplified dramatically
with little practical loss of predictive accuracy via "hand" or
non-machine solutions. Multiple regression, either with a sophisticated
computer system or via a nominal method readily derived using routine 
office procedures, is recommended in an appropriate form. Technical 
advice, however, is also recommended when an admissions office is using 
such a system, as both statistical and measurement problems frequently 
are not recognized by admissions personnel and thus allow unintended 
assumptions to be incorporated into the admissions model. A trivial 
example is that different means and variances for each variable can
serve as unintended weighting factors and have an undesirable effect upon
decisions regarding acceptance and rejection of applicants, especially 
for those near the cutoff score. It should always be remembered that all 
of these selection methods are limited by the quality of the predictor 
variables which they use. Furthermore, the decision-making procedure 
should be chosen which can best serve institutional goals -- the tail 
shouldn't be allowed to wag the dog. 
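
The unintended weighting produced by unequal variances can be shown with a
few lines of arithmetic. In the sketch below four hypothetical applicants
are ranked on an equally weighted sum of a percentage grade average and a
seven-point interview rating, first from raw scores and then from
standardized scores. All names and values are invented for illustration,
and standardizing is offered only as one common remedy.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of scores to mean 0 and standard deviation 1."""
    centre, spread = mean(values), pstdev(values)
    return [(value - centre) / spread for value in values]


def rank_by_composite(names, columns, weights):
    """Rank applicants, best first, on a weighted sum of the given columns."""
    composite = [
        sum(weight * column[i] for weight, column in zip(weights, columns))
        for i in range(len(names))
    ]
    order = sorted(range(len(names)), key=lambda i: composite[i], reverse=True)
    return [names[i] for i in order]


# Hypothetical applicants: percentage grade average and 1-7 interview rating.
names = ["A", "B", "C", "D"]
grades = [80, 74, 77, 71]
interview = [3, 7, 5, 6]

raw_rank = rank_by_composite(names, [grades, interview], [1, 1])
std_rank = rank_by_composite(names, [z_scores(grades), z_scores(interview)], [1, 1])

if __name__ == "__main__":
    print("raw scores:         ", raw_rank)
    print("standardized scores:", std_rank)
```

With raw scores the nominally equal weights allow the more variable grade
average to dominate, so that applicant A ranks first. Once both variables
are standardized, applicant B, who has the highest interview rating, moves
to the top.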


A frequent practice when various kinds of data such as test scores 
and biographical data are assembled is that a human judge makes a 
decision as to which candidates will be rejected. Such a person may 
have access to statistical summaries and the candidate's file and, 
although the issues are not resolved fully, it seems clear that the 
mechanical statistical combination of data is practically never improved 
upon by modification by human judgment (Sawyer, 1966). However, if 
judges or interviewers are to be used in the selection process, present 
findings suggest that the data generated be introduced into prediction 
equations or other statistical combination procedures that generate the 
final decision. 


Persons interested in implementing an admissions procedure will need 
to examine the various kinds of benefit, criteria, and possible measures 
for making decisions. Accordingly, this will lead to the setting of 
cutoff scores that may be examined from the perspective of classical 
(predicting categories for measurements) and decision theory (gain in 
utility) procedures. 


CHAPTER IV 


GENESIS OF THE RATING INSTRUMENT 


The theoretical foundation for the interview-rating instrument 
was outlined in Characteristics of Teachers, where David G. Ryans 
states that the Teacher Characteristics Study, which he headed,
represented "one of the most extensive research programs that has been 
directed at the objective study of teachers" (p. 6). Over a period of 
six years (1948-54), approximately 100 separate research projects were 
carried out, involving more than 6,000 teachers and 1,700 schools in 
450 school systems across the United States. 


One of the principal concerns that prompted the research was 
"the need for procedures for appraising certain characteristics of
prospective teachers before or during pre-service training and at the 
time of employment by school systems to help improve teacher selection 
and assignment" (p. 9). 


A major objective of the study, growing out of that need, is 
described as follows: "The identification and analysis of some of the 
patterns of classroom behaviour, attitudes, viewpoints, and intellectual 
and emotional qualities which may characterize teachers" (p. 9). 


The general procedure of the study consisted of the following 
phases: (1) extensive observation of the classroom performance of 
teachers, resulting in a collection of more than 500 "critical incidents" 
or significant teacher acts; (2) subsequent selection of relevant first- 
order teacher behaviour dimensions from that assembly of "critical 
incidents" for inclusion in new observational instruments; (3) the 
assessment of the classroom behaviour of large numbers of elementary 
and secondary school teachers, utilizing those instruments; and 
(4) statistical analysis of the teacher behaviour assessments leading to 
the identification of some major patterns of teacher behaviour, as 
observed in the classroom. 


The assessment form, which came to be known as the Classroom 
Observation Record (C.O.R.), underwent numerous revisions. As mentioned 
above, the behaviour dimensions chosen for inclusion in the C.O.R. 
were derived from the lists of significant teacher behaviours or 
"critical incidents." A basic methodological assumption in the 
assessment record's format was the hypothesis that personal-social 
traits may be conceptualized as bi-polar dimensions, the opposite poles 
of which can be described with reasonable precision. The final version 
of the record utilized a seven-interval scale for rating those dimensions. 


It is instructive to note that the researchers attempted at an early 
stage "to reduce the assessment procedure to the tabulation of the 
frequency with which each of the specific behaviours was observed for 
a particular teacher" (p. 84). This mechanical scoring system, however, 
had to be abandoned in favour of a more impressionistic procedure. Ryans 
explains: 

A number of scoring systems for such a check-list 
approach were derived and tried out, but the 
technique proved cumbersome and less reliable, as 
judged by interobserver correlations, than the 
earlier-used estimation procedure. The apparent 
objectivity of a check-list approach makes it 
particularly attractive, and the results obtained 
in this type of attempt to assess teacher behaviour 
was (sic) disappointing to the staff of the Study. 
Use of check-lists, therefore, was discarded in 
favour of the more intuitive procedure, standardized 
and controlled through use of the Glossary and by 
training of the observers, but nevertheless relying 
upon a less objective summing-up of specific 
behaviours in arriving at an assessment (pp. 84-85). 

This excursion into methodological history was prompted by the 
unexpectedly high degree of inter-rater agreement revealed in the 
Queen's University procedures, this despite the lack of rigorous 
attention to ensuring standardized observer performance. It is at 
least conceivable that, in view of Ryans' indifferent success with 
categorical check-lists, there is some unrecognized basis for consistency 
in the intuitive approach to behaviour assessment when utilizing 
carefully defined bi-polar dimensions and when employing an alert, 
educated group of observers (even if technically untrained). 


To return now to the general procedures of the Ryans study, once 
the data provided by observers' assessments had been gathered, two 
independent factor analyses were undertaken -- one on the intercorrelations 
of assessments of 275 elementary school teachers, and the other on the 
intercorrelations of assessments of 249 secondary teachers. 


With reference to the assessments of elementary teachers, Ryans 
writes (p. 97): 


Product-moment correlation coefficients were computed 
among twenty-four of the dimensions. The resulting 
table of intercorrelations was factor-analyzed by 
the centroid method and both orthogonal and oblique 
rotations were attempted. Five centroid factors were 
extracted. Orthogonal rotation of these factors did 
not yield an acceptable solution. The oblique factors, 
however, provided a solution that more satisfactorily 
met the customary criteria of simple structure. 


Following a similar procedure for secondary school teachers, Ryans 
extracted six factors from the 25 variables involved. 
Three factors were common to both elementary and secondary teachers: 


These primary teacher classroom behaviour patterns 
were designated as follows: TCS Pattern Xo, 
reflecting understanding, friendliness, and 
responsiveness vs aloofness and egocentrism on 
the part of the teacher; TCS Pattern Yo, reflecting 
responsible, businesslike, systematic vs evading, 
unplanned, slipshod teacher behaviour; and TCS 
Pattern Zo, reflecting stimulating, imaginative, 
original vs dull, routine teacher behaviour. 
Practical experience as well as the empirical data 
indicate that these are three of the principal areas 
involved in interpersonal relations, and that they 
might well be given basic consideration in the 
theory of teacher behaviour and also in teacher 
personnel procedures (pp. 102-103). 


The simulation materials developed by Dale Bolton for the training 
of educational administrators in teacher selection are based on a 
sophisticated model of decision making. The following statements 
(Bolton, 1970) will give a general idea of the approach: 

If all information about the teacher applicants 
is collected and then a single final selection 
decision is made, the data might take the form 
of multivariate information collected to predict 
various behaviors which have varying utility in 
relation to some institutional goal. 

The tasks of the person who selects teachers 
include: (a) collecting reliable information, 
(b) using this information to predict consequent 
behaviors of the teacher, and (c) relating these 
behaviors to the operation of the organization 
so that a prediction of the total utility of the 
individual to the organization can be obtained. 

These tasks are necessary to ascertain the 
relative merit of each applicant for a specific 
assignment. In addition, of course, the decision 
maker must determine how many (if any) of the 
applicants should be hired at a particular time. 
This decision depends on the quota to be filled 
at the time, the quality of the applicants being 
considered, the probability that additional 
people will apply, and the probable quality of 
additional applicants. The number of additional 
applicants and their quality are related to the 
time of the year (p. 3). 

All decisions involve a consideration of the 
state of nature and a choice among alternatives. 
The choice is made on the basis of predicting 
(attaching a probability to) the consequences of 
the various alternatives and then assigning a 
value to the consequences predicted. It is the 
combination of the probable occurrence of an 
event and the value of the event that provides a 
utility measure for an alternative. The selection 
materials provided include the state of nature (the 
description of the situation), various alternatives 
(the applicants), a requirement to predict 
consequences (on the Teacher Evaluation Instrument), 
a value system (the explicit criteria for selection), 
and a choice (rank ordering of applicants) (p. 8). 
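
The utility reasoning in the passage above can be sketched in a few lines of code. This is only an illustrative sketch, not Bolton's procedure: the applicant names, probabilities, and values are hypothetical, and the sum of probability times value is assumed here as one common way of combining the two quantities he mentions.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    probability: float  # rater's predicted probability of the consequence
    value: float        # value attached to that consequence by the selection criteria


def expected_utility(outcomes):
    """Combine probability and value: sum of probability x value over outcomes."""
    return sum(o.probability * o.value for o in outcomes)


def rank_applicants(predictions):
    """Return applicant names ordered from highest to lowest expected utility."""
    return sorted(
        predictions,
        key=lambda name: expected_utility(predictions[name]),
        reverse=True,
    )


# Hypothetical applicants and figures, for illustration only.
predictions = {
    "Applicant A": [Outcome(0.7, 10.0), Outcome(0.3, 2.0)],
    "Applicant B": [Outcome(0.5, 10.0), Outcome(0.5, 4.0)],
    "Applicant C": [Outcome(0.2, 10.0), Outcome(0.8, 3.0)],
}

ranking = rank_applicants(predictions)
print(ranking)
```

The rank ordering produced at the end corresponds to the "choice" element in Bolton's list; the explicit value column stands in for his "value system".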


In general, the decision-making process is based on two broad 
categories of data inputs: (1) descriptive information and (2) predicted 
outcomes. Each of these two categories operates within two domains -- the 
personal and the situational (or organizational). The relationships 
between these four aspects may be represented by a four-cell matrix as 
in Figure 1. 


                                  DATA INPUT 

                         DESCRIPTIVE        PREDICTED 
                         INFORMATION        OUTCOMES 

          PERSONAL 
DOMAIN 
          SITUATIONAL 


Figure 1 
Relation of Information and Predictions to Personal and 
Situational Variables in the Selection of Applicants. 


Extrapolating from teacher selection to the selection of student 
applicants, it becomes evident that there are many fewer variables to 
guide decision-making in the latter case. For instance, in the situational 
domain the variables are either common (such as the demands of the 
B.Ed. program) or unknown (such as the demands of particular positions 
in specific schools where graduates will ultimately find employment); 
being a constant in either case, this entire set of variables provides 
little help in discriminating between subjects. Likewise, the 
information category of the personal domain turns out to be of marginal 
use because all applicants must pass an initial screening of their 
academic admissibility. Once their scholastic eligibility has been 
established, the information category can provide very little data for 
justifying the admission of some and the exclusion of others. Figure 2 
indicates that only the prediction of personal behaviour remains as a 
significant basis for discriminating between applicants. 


                                  DATA INPUT 

                         DESCRIPTIVE        PREDICTED 
                         INFORMATION        OUTCOMES 

          PERSONAL       Some basis for     Considerable basis 
                         discrimination     for discrimination 
DOMAIN 
          SITUATIONAL    Little basis for   Little basis for 
                         discrimination     discrimination 


Figure 2 
The Utility of Different Variables in the Selection of 
Student Applicants to a Faculty of Education 


For the prediction of consequent teacher variables, Bolton devised 
a machine-scorable teacher evaluation instrument which requires the rater 
to predict how the applicant will be evaluated at the end of one year of 
teaching in a pre-specified teaching post. Forty-nine bi-polar behavioural 
dimensions (e.g., stimulating-dull) are organized under nine "teaching 
acts" or professional contexts: planning and organizing classwork, 
classroom management, creating a motivational environment, instruction, 
evaluation, guidance and counselling, out-of-classroom professional 
activities, relations with staff and parents, and school-community 
relations. Scoring may be done in two ways. One way is to tally the 
scores of the scales in each of the separate role activities, thus 
obtaining nine professional role ratings. Another way, however, is to 
assign each of the 49 items to one of five major dimensions derived from 
Ryans' study. For this scoring approach, Bolton provides five templates 
representing organization, sociability, originality, empathy, and 
buoyancy. By adding the scores showing in the windows of each template, 
one can obtain summary ratings for the five generic factors, called 
important attributes. 
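
The two scoring approaches just described amount to summing the same item ratings under two different groupings. The sketch below illustrates the idea on a hypothetical six-item miniature; the item numbers, ratings, and group memberships are invented for illustration and are not Bolton's actual 49-item assignments.

```python
def score_by_group(ratings, groups):
    """Sum the item ratings that fall within each named group of item numbers."""
    return {name: sum(ratings[i] for i in items) for name, items in groups.items()}


# Hypothetical miniature instrument: six items rated on a seven-point scale.
ratings = {1: 5, 2: 6, 3: 4, 4: 7, 5: 3, 6: 5}

# First approach: tally items by teaching act (professional role context).
role_contexts = {"instruction": [1, 2, 3], "evaluation": [4, 5, 6]}

# Second approach: each "template" exposes only the items of one attribute.
templates = {"organization": [1, 4], "originality": [2, 5], "empathy": [3, 6]}

role_scores = score_by_group(ratings, role_contexts)
attribute_scores = score_by_group(ratings, templates)

print(role_scores)
print(attribute_scores)
```

Because every item belongs to exactly one group under each approach, both sets of summary ratings add up to the same grand total; only the partition differs.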




The influence of Bolton's work on the Queen's interview rating 
instrument was very marked, for he had operationalized Ryans' empirical 
findings into an attractive and useful format for the real world of 
decision-making. However, because his materials were intended for use 
with certificated teachers (not applicants for teacher training) and 
were designed specifically for elementary teachers (not secondary school 
teachers), some considerable modification, as well as simplification, 
of his materials was required. 


The final version of the Queen's Interview Assessment form (Figure 
3) utilizes 30 bi-polar scales. Bolton's professional role contexts 
have been removed as a sharp focussing device and appear at the top as 
a general statement of instructions applying to the entire instrument. 
Because Queen's lacked matching scoring facilities, the instrument 
was redesigned for keypunch coding. Reducing turnaround time to an 
absolute minimum represented a major concern in all these plans since 
the prime function of the interview assessment forms was to serve 
pressing administrative purposes rather than research purposes alone. 


It was hoped that, by requiring observers to make decisions about 
30 discrete dimensions on the basis of 210 choice-points, some of the 
stereotyping that frequently characterizes personal interviews, 
particularly hurried ones, might be counteracted. Furthermore, the 30 
items were arranged randomly so as to mask the six important attributes 
which they exemplified; namely, empathy, organization, buoyancy, 
originality, leadership, and professional impression. The first four of 
these terms are the exact names Bolton assigned to his clusters of 
dimensions. The other two attributes, leadership and professional 
impression, were original in nomenclature and in many of the component 
items. Figure 4 illustrates how the 30 items were classified into six 
attributes and indicates as well their source of origin. Given its 
bi-polar construction, each item consists of two terms, yielding a total 
of 60 terms for the 30 items. Of the 60 terms used to develop the final 
instrument presented in Appendix 2, 21 came from Ryans, 20 from Bolton, 
and 19 from Peruniak. 


                         INTERVIEW ASSESSMENT 

APPLICANT:    Student Number ________   Sex ___   Date (Day / Mo. / Year) ________ 
INTERVIEWER:  No. ______   Status 39 [ ]   (Code: T - Teacher, A - Administrator, F - Faculty, S - Student) 
              Signature ____________________ 

PERSONAL DIMENSIONS 

Indicate the rating for each dimension by marking an 'X' in the appropriate box on each line. 
Since these dimensions should be considered in some professional context, the interviewer should estimate 
the candidate's potential in the light of these common professional roles: 

    As a manager of classroom instruction     As a member of staff & the school community 
    As a guide & counsellor of students       As a member of professional associations 

    Aloof             40   Responsive 
    Dull              41   Stimulating 
    Disorganized      42   Systematic 
    Fuzzy             43   Precise communication 
    Irresolute        44   Authoritative 
    Rigid             45   Adaptable 
    Autocratic        46   Democratic 
    Aimless           47   Purposeful 
    Pessimistic       48   Cheerful 
    Evading           49   Responsible 
    Unimpressive      50   Personal magnetism 
    Suspicious        51   Trusting 
    Easily upset      52   Self-possessed 
    Inflexible        53   Open-minded 
    Uninspiring       54   Challenging 
    Critical          55   Kindly 
    Humourless        56   Sense of humour 
    Distracting       57   Pleasant voice 
    Lethargic         58   Vigorous 
    Inconsistent      59   Consistent 
    Hindering         60   Helpful 
    Unimaginative     61   Resourceful 
    Expressionless    62   Expressive 
    Antagonistic      63   Co-operative 
    Dependent         64   Self-reliant 
    Stereotyped       65   Original 
    Irrational        66   Rational 
    Apathetic         67   Alert 
    Unintelligible    68   Fluent 
    Retiring          69   Forceful 

ORDER OF IMPORTANT ATTRIBUTES (Rank 1 to 6) 

    Empathy                   70 
    Organization              71     (1 is the attribute which is most 
    Leadership                72     evident, and 6 is the attribute 
    Buoyancy                  73     which is least evident.) 
    Originality               74 
    Professional Impression   75 

OVERALL IMPRESSION (MARK WITH 'X') 

    Unsuitable   76   1   2   3   4   5   6   7   Outstanding 

COMMENTS: 


Figure 3 
Queen's Interview Assessment Form 


ATTRIBUTES                ITEM   ITEMS 

EMPATHY                     7    autocratic : democratic 
                           12    suspicious : trusting 
                           16    critical : kindly 
                           21    hindering : helpful 
                           24    antagonistic : co-operative 

ORGANIZATION                3    disorganized : systematic 
                            8    aimless : purposeful 
                           10    evading : responsible 
                           20    inconsistent : consistent 
                           27    irrational : rational 

LEADERSHIP                  5    irresolute : authoritative 
                           15    uninspiring : challenging 
                           19    lethargic : vigorous 
                           25    dependent : self-reliant 
                           30    retiring : forceful 

BUOYANCY                    1    aloof : responsive 
                            9    pessimistic : cheerful 
                           17    humourless : sense of humour 
                           23    expressionless : expressive 
                           28    apathetic : alert 

ORIGINALITY                 2    dull : stimulating 
                            6    rigid : adaptable 
                           14    inflexible : open-minded 
                           22    unimaginative : resourceful 
                           26    stereotyped : original 

PROFESSIONAL                4    fuzzy : precise comm. 
IMPRESSION                 11    unimpressive : personal magnetism 
                           13    easily upset : self-possessed 
                           18    distracting : pleasant voice 
                           29    unintelligible : fluent 

SOURCES (Left Term, Right Term): R = Ryans   B = Bolton   P = Peruniak 
[source codes for the individual terms are not legible in this copy] 


Figure 4 
Classification of Thirty Queen's Interview Assessment Form Items 
by Attribute and Source of Bi-polar Adjectives 


Two other features of the instrument should be mentioned. (1) For 
purposes of checking the observers' consistency and the instrument's 
content validity, a second section required raters to rank-order the 
important attributes displayed by each interviewee; results on this 
section could then be compared with the rank-ordered mean scores of the 
30 items when grouped into attributes. (2) The final section of the 
instrument, titled Overall Impression, forced observers to make a global 
evaluative judgment in the form of a single score on a seven-point scale 
ranging from unsuitable to outstanding. 


These then were the principal characteristics of the assessment 
instrument developed at the Faculty of Education, Queen's University, 
for interviewing purposes during 1971. 


CHAPTER V 


INTERVIEW PROCEDURES 


The interviewing of academically qualified applicants for the 
new academic year was scheduled to begin in March, 1971. At that time, 
the faculty numbered 40, and the enrolment stood at 331. The projected 
enrolment figure for September, 1971, had been set at 650. Because 
interviews are conducted during the academic year preceding the year of 
actual attendance, it will be seen that, as the student enrolment was 
being doubled, the faculty complement available for interview purposes 
remained constant; new staff members would not be joining the Faculty 
until after the completion of the admissions phase. 


Estimating that about 1,000 interviews would need to be conducted 
in order to recruit 650 candidates, the Faculty decided to stretch and 
augment staff resources in the following ways: (1) interviews would be 
limited to 30 minutes; (2) interviewing panels would consist of no 
more than three members; (3) outside personnel would be invited to 
assist faculty with the interviewing. Thus, a typical interviewing 
panel consisted of one faculty member (the chairman), one 
candidate-in-training, and one practicing teacher or administrator from 
the associate schools. 


Training of observers could only be described as minimal. Less 
than half of the student and faculty interviewers attended a practice 
session, which featured the presentation of some interviewing guidelines, 
the rating of an applicant in a videotaped interview, and follow-up 
discussion. None of the assisting school personnel could be included 
in this orientation exercise, nor was any training provided to the 800 
associate teachers who rated candidates during the subsequent practicum 
phase. 


However, all interviewing panels were provided with a nine-page set 
of printed instructions, which was to be studied prior to the first 
interview (Appendix 3). These guidelines were not distributed to the 
associate teachers in the follow-up stage (the practicum); they received 
only one page of general information about the project (Appendix 4). 


Understandably, many of the individuals connected with this project 
entertained serious reservations about the usefulness of the enterprise. 
Some of the most fundamental principles governing indirect measurement 
appeared to be violated: 


(i) observers, by and large, were untrained; 

(ii) the procedures for observation, in the main, were 
nonsystematized; 

(iii) much of the assessment was based on inferred rather 
than observed categories; 

(iv) observational recording did not proceed concurrently 
with the events but was based on recall of past behaviour 
after the interview was over. 


Therefore, there were grounds for fearing that raters’ assessments 
might tend to be at considerable variance with one another, that they 
might reflect biases stemming from lack of criterion specificity, and 
that they might be contaminated by the unpredictable, non-directed 
nature of the interviewing transaction. 


Nevertheless, once the policy decision had been made, reservations 
about the lack of that degree of methodological rigour demanded by 
fundamental research were not allowed to impede the arrangements. It 
was hoped that in the different world of professional practice even a 
modest degree of systematization might yield some useful dividends. 


The main interviewing thrust was planned for March 18 and 19. 
All classes were cancelled for those two days in order to release 
faculty and students for that purpose. Interviewing took place at five 
centres: Kingston, Ottawa, Montreal, Toronto, and Kitchener. No 
difficulty was experienced in securing a slate of 40 enthusiastic 
students to become panel members. Likewise, the directors of education for 
the surrounding counties proved highly cooperative in releasing a total 
of 40 administrators and teachers to assist with the interviewing. 




Approximately 465 interviews were conducted in that first round, 
leaving several hundred still to be held in the coming weeks and months. 
In subsequent interviews it was rarely possible to involve school 
personnel. Students also became unavailable after the term ended in 
early May. The registrar, who was responsible for scheduling interview 
times and identifying faculty interviewers, found the staff increasingly 
scarce. As the time-lag caused by the interview phase threatened to 
jeopardize the attainment of the Faculty's target figure, an administrative 
decision was made to grant admissions to the backlog of applicants in 
order to meet the quota and to postpone their personal interviews until 
the fall registration. The monthly distribution of interviewing was 
as follows: March, 465; April, 138; May, 95; June, 34; July, 2; 
August, 3; September, 99; October, 4. Over 840 applicants were 
interviewed to generate a class of 640. 
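
As a quick arithmetic check on the monthly distribution reported above, the figures can be totalled and set against the resulting class size. The variable names below are illustrative only; the counts are those given in the text.

```python
# Monthly interview counts as reported in the text.
interviews_by_month = {
    "March": 465,
    "April": 138,
    "May": 95,
    "June": 34,
    "July": 2,
    "August": 3,
    "September": 99,
    "October": 4,
}

total_interviews = sum(interviews_by_month.values())

# Class size reported in the text.
class_size = 640

# Share of interviewed applicants who ended up in the class.
share_admitted = class_size / total_interviews

print(total_interviews, round(share_admitted, 2))
```

On these figures roughly three of every four interviewed applicants entered the class, and the March round alone accounts for more than half of all interviews.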


Both the student members and the school members of the panels 
generally found the interviewing very stimulating and satisfying. 
Practicing teachers and administrators were also deeply impressed by the 
excellent calibre of young men and women seeking admission to the 
profession. Later, several directors of education endorsed with 
enthusiasm this model of faculty-school collaboration and urged that it 
be continued and expanded. 


Faculty members, on the other hand, tended to be more polarized 
about the issue. Some genuinely questioned the predictive value of such 
interviews; a number doubted their own personal competence for this 
serious responsibility; a few resented the heavy demands of time. It 
was not altogether surprising, therefore, that the Faculty Board 
eventually eliminated the universal interview as an element in the 
admissions procedure for the following year. 


If a hypothetical commentary may be permitted at this point, it 
is the conjecture that perhaps very few faculty would remain committed 
to heavy annual involvement in interview procedures -- unless selection 
was directly related to the composition of their own classes. Screening 
applicants for the good of the profession or for the ultimate benefit 
of school children or even for the faculty of education does seem like 
an altruistic objective much too abstract and personally remote 
for most mortals. 


There were two periods of practicum experience: November 1-12, 1971, 
and February 7-March 3, 1972. Associate teachers were requested by 
means of a one-page memorandum to complete an assessment instrument for 
each candidate whose practice teaching they observed. The level of 
returns in this follow-up phase would have to be characterized as good 
on the part of associates and fair-to-poor on the part of faculty. 


Initially, there was some concern on the part of associates. However, 
the faculty in their visits to the schools were soon able to allay much of 
the anxiety and to correct most misperceptions. Nevertheless, a small 
number of associates sent in letters indicating that they were unable to 
participate in the study. Their grounds for refusing included violation 
of professional ethics, vagueness of the criteria, and their inability 
to assess personal qualities. A smaller number wrote letters of support 
commending the Faculty for undertaking a vital piece of research and 
for developing sophisticated instrumentation. 


Another group with very strong feelings about the project were the 
actual subjects of the treatments -- the candidates. They communicated in 
no uncertain manner their sense of betrayal at not being informed about 
the personal assessment instrument prior to the first practicum. The 
explanation that it was judged unwise to burden them with additional 
anxiety for their first round of practice teaching and that, consequently, 
the announcement was planned for their return from the schools, received 
short shrift. 


The negative reactions of some of the associates and of many of the 
candidates illustrate again the need to maintain desirable personal 
contact without jeopardizing the representativeness of the situation 
for research purposes. A recognition of the aversive side effects of 
certain research procedures, however, is only a first step towards a 
solution -- not the solution itself. The remaining part of the task poses 
some critical dilemmas; for instance, how can real freedom of choice be 
extended to the subjects and still ensure a representative sampling of the 
population? 




On December 3, 1971, during the Professional Issues period, which 
the entire student body is expected to attend, a detailed explanation 
of the interview project was presented to the candidates. The 
assessment form was then distributed through the audience, and the students 
were requested to do a self-report. 


After the venting of some initial hostility and skepticism, the 
overwhelming majority of students complied with the request; 344 
returns came in, of which 14 were spoiled and 330 were usable. Since 
the absentee rate that morning approached 50%, it is obvious that the 
return rate from those in attendance was surprisingly high. Although 
a small percentage continued to find the whole thing repugnant, apparently 
most of the candidates were sufficiently persuaded of the institutional 
value of this study that they exposed themselves to the embarrassment 
of self-evaluation on highly personal dimensions. 




CHAPTER VI 


ANALYSIS OF THE DATA 


The main purpose of this report is to describe the long-term 
project that initially was concerned with the development of an interview- 
rating instrument. Some six years later the focus was upon a comparison 
of interview results with ratings of teacher effectiveness. An initial 
overview of the development stages and analysis of the associated 
data is presented, in part, to demonstrate the basis for claims made 
about the interview results and to provide documentation for any in-depth 
follow-up that may result. 


The data presented are based on returns from the following groups 
of raters and assessment contexts: 


Group                   N       Assessment Context 
1. Faculty             933      Pre-Admission Interviews; Student Teaching 
2. Students            467      Pre-Admission Interviews 
3. Teachers            346      Pre-Admission Interviews 
4. Administrators      254      Pre-Admission Interviews 
5. Associates 1        660      Student Teaching (November) 
6. Associates 2        778      Student Teaching (February) 


It may be noted that assessments by groups 2, 3, and 4 were derived 
entirely from the pre-admission interviews. By contrast, assessments 
by groups 5 and 6 were derived from the supervision of students during 
practice teaching over periods of 2-4 weeks; we may properly regard 
these returns as post-admission data. The 933 returns from group 1 
represent a mix of pre- and post-admission data; in certain respects 
it might have been advantageous to have kept the two Faculty categories 
separate. 


Data from two other groups will be included incidentally in this 
report, namely: 7 - Student Self-Ratings (325 returns) and 8 - Composite 
(600 returns), representing a bolstered school educators’ group attained 
by combining returns from 3 and 4. 


An heuristic approach to the issue of inter-rater agreement, 
rather than a statistical one, will be presented. Evidence of stability 
and consistency of ratings will be sought in each of the three main 
sections of the instrument: (1) in the first section, "Personal 
Dimensions", the mean scores and standard deviations for the 30 items 
will be examined; (2) in the second section, "Order of Important 
Attributes", rank-order distributions will be compared; and (3) in the 
final section "Overall Impression", the global ratings will be analyzed. 


Table 1 presents the mean scores assigned to each of the 30 items 
by each of the six rating groups. Examination of the highest and 
lowest means shows very little spread for any of the groups; for example, 
the administrators scored a low of 4.62 (on item 15) and a high of 
5.36 (on items 24 and 28), representing a difference of only .74. The 
greatest difference was recorded by Associates 1, with a low mean of 4.54 
assigned to item 30 and a high mean of 6.07 assigned to item 24, yielding 
a difference of 1.53, which is still quite minimal considering that it 
covers a rating scale of seven intervals and a range of 30 disparate 
items. 
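
The spread figures quoted in the preceding paragraph are simple differences between each group's highest and lowest item means. A minimal sketch of that check follows, taking the Associates 1 low mean as 4.54, the value consistent with the stated difference of 1.53; the function and variable names are illustrative only.

```python
def spread(low_mean, high_mean):
    """Difference between a group's highest and lowest item means."""
    return round(high_mean - low_mean, 2)


# (low mean, high mean) for the two groups discussed in the text.
extremes = {
    "Administrators": (4.62, 5.36),
    "Associates 1": (4.54, 6.07),
}

spreads = {group: spread(low, high) for group, (low, high) in extremes.items()}
print(spreads)
```

Even the larger of the two spreads covers only about a point and a half of the seven-interval scale.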


Table 2 presents the standard deviations for the scores assigned 
to each of the 30 items by the six rating groups. Once again the 
general impression is that of relatively little variation. 


The section titled "Order of Important Attributes" represented 
the second rating task for interviewers, following the first task 
which was scoring the 30 items under "Personal Dimensions". On the basis 
of their observation of the student, raters were asked to rank-order 
the six specified attributes; namely, empathy, organization, leadership, 
buoyancy, originality, and professional impression. Raters assigned 
1 to the most prominent attribute and 6 to the least evident. 


It will be recalled that the purpose of this section was to pro- 
vide a check on internal consistency. Would the rank-ordering of the six 


45 


Table 1 


Mean Scores on 30 Items, Assigned 


by Six Groups 


Item Faculty Students Teachers Admin. Assoc. 1 Assoc. 2 

(N=933) (N=467 ) (N=346) (N=254) (N=660) (N=778) 

] oetole! 2 0h) eyes eee) 2 eRe | 6 ou 

2 5.20 4.96 5.428 4.92 4.69 4.80 

3 5.38 5.25 Sides 4.94 5 ALU opel 
4 5.29 5205 S316 4.91 4.69 4.77 
is 5.09 4.87 4.8] 4.69 4.58 4.64 

6 Soe Set 53! 5,05 Bae Say 
q) eS ayo) 5.34 5.09 533 Nees 
8 5.68 5 46 5.56 eas 52 yae9 
9 ayash 5762 5.60 Spey ¥/ 5 oT Seog 
10 Seis 5.60 5158 Daal sails 5f65 
11 5.09 4.84 5.05 4,74 4.78 4.85 
12 5 45 beee 5.39 bee 5.61 515 | 
13 ohgey; 5.34 5.35 5.08 5.34 a6 
14 5.46 ony! 5.40 Sheth s 5.53 5.46 
15 4.95 AY 4.92 4.62 4.62 4.74 
16 5.54 5.59 5455 5237 5.66 Seo | 
17 5.16 5.09 5.14 4.80 Sean bes | 
18 seecy ao 5336 yet Spy ae: 5333 
19 aaae 4.96 5.20 4.86 4,89 4.95 
20 Deol 5.09 5242 5.05 5.38 5e32Z 
21 5765 5.49 5163 ee 5 5.84 5070 
22 B25 5. OF 5022 4.97 Se he 5.20 
23 5226 5.0/7 5.28 5.00 4.84 4.93 
24 bal 5 5.60 bee | 5.30 6.07 bEaS 
25 567, 5233 5.40 Seal Sey oF 35 
26 4.96 A443 4.92 4.70 jag ie 4.85 
27 5e59 5.47 5565 5ye0 5.63 See fl 
28 5.60 bts, ba0o eyes 5.44 Sac 
29 5.42 5.39 B55 eek?) 5.20 5.34 
30 4.96 4.68 4.94 4.69 4,54 4.65 
High Mean 5.78 Stay Deol 0 6.07 5.88 
Low Mean 4.95 4.68 4.81 4.62 54 4.64 
Mean OTs Ne. 30 Be 535 5.05 5.24 5.23 


Means 


46 


Table 2 


Standard Deviations on 30 Items Scored 
by Six Groups 


Item Faculty Students Teachers Admin. Assoc. | Assoc. 2 
(N=933) (N=467) (N=346) (N=254) (N=660) (N=778) 
] 1-14 The a5 1.206 iWeaays 1820 
2 esos: 1.34 LEZ6 leas 1.24 1827 
3 jeg is Ned be19 Lok 1.19 1326 
4 1.30 T.34 1.50 Tez 1.23 1a 
5 1223 Ts23 VE31 1.19 Vrs Tals 
6 die 1.34 fais 1.18 ‘lS 1.20 
hh aS) Pehl isl ln k2 Lal4 1815 
8 eed Aes 1.19 1223 Tele 1.19 
9 ie20 pays 1.19 1223 See Wwea4 
10 at i eth 1.18 Leu lal’ 1.26 
1] 1.42 1.48 1438 1255 1.43 DS33 
12 1203 [ale Ter 1.09 1.04 1.09 
3 1.4 ees 138 Ve2g 1.26 1823 
14 Lens 1220 1213 1.20 hs Tear 4 
15 1335 1.38 t33 1:32 1236 1828 
16 1.09 105 aes! 1.10 1.04 Les 
17 Te2n V281 12] V3) 26 17 
18 ha24 1225 1.29 14 eos Vets 
19 ee. 134 Hazg Leo ].34 1280 
20 1.05 0.99 1.13 1.09 TO T1707 
21 0.97 1.09 0.94 0.96 0.92 1.05 
22 [282 1.40 1.34 13a 1.34 183] 
23 1232 Te38 TH3Z 1.82 V32 1422 
24 0.96 T.02 0.94 1.04 0.96 1.04 
20 [peacte 1.29 To 7, lee 1.40 Wee 
26 ae P82 Aad pS! ail thaya) Lets 
27 0.97 0.94 Ee O02 1.04 0.98 1.00 
28 pale Pare Te 16 1.26 lowe E20 
29 eee Tiesh4 lay al Vets 1.14 1.09 
30 1.39 ttayd 1.44 ck 1ee3e 1253 




attributes in section two be congruent with the rank-ordering derived 
from the scores on the 30 items of section one? For the purpose of 
rank-ordering the data in section one, the mean scores of the five 
component items comprising each attribute cluster were summed. The 
following table identifies the component items within each attribute: 


         Empathy  Organization  Leadership  Buoyancy  Originality  Professional
                                                                   Impression
Items       7          3            5          1           2            4
           12          8           15          9           6           11
           16         10           19         17          14           13
           21         20           25         23          22           18
           24         27           30         28          26           29
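The summing step described above can be sketched in Python. This is only an illustration, not the authors' procedure as run: the item groupings follow the attribute table above as far as it can be read (cross-checked against the groupings in Table 6), the function name `rank_attributes` is hypothetical, and the item means are placeholder values invented for the example rather than figures from Table 1.

```python
# Item numbers grouped by attribute (five items per attribute), as read from
# the attribute table and cross-checked against Table 6.
ATTRIBUTE_ITEMS = {
    "Empathy": [7, 12, 16, 21, 24],
    "Organization": [3, 8, 10, 20, 27],
    "Leadership": [5, 15, 19, 25, 30],
    "Buoyancy": [1, 9, 17, 23, 28],
    "Originality": [2, 6, 14, 22, 26],
    "Professional Impression": [4, 11, 13, 18, 29],
}


def rank_attributes(item_means):
    """Sum the five item means per attribute, then rank attributes (1 = highest sum)."""
    sums = {
        attr: sum(item_means[i] for i in items)
        for attr, items in ATTRIBUTE_ITEMS.items()
    }
    ordered = sorted(sums, key=sums.get, reverse=True)
    ranks = {attr: position for position, attr in enumerate(ordered, start=1)}
    return sums, ranks


# Placeholder means (hypothetical, NOT from Table 1): item k gets 5.00 + k/100.
item_means = {k: 5.0 + k / 100 for k in range(1, 31)}
sums, ranks = rank_attributes(item_means)

for attr in sorted(ranks, key=ranks.get):
    print(f"{ranks[attr]}  {attr:<24} {sums[attr]:.2f}")
```

Feeding in one rater group's column of 30 item means would give that group's section-one rank order, which can then be set beside its section-two ranks.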


A summary comparison of the rank-ordering results from the two
sections appears in Table 3. The two patterns are far from congruent.
Although raters consistently place empathy and organization in the
first two ranks in both sections, the ordering of the remaining four
attributes shows no relationship between sections. These disappointing
results suggest to the investigator that (1) the attributes are low in
content validity, and/or (2) the rank-ordering task in section two is far
too difficult, given the high number of variables (30 items) to be
mentally processed.
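One conventional way to put a number on how congruent two rank orderings are is Spearman's rank correlation. The sketch below is not part of the original analysis; the function name and the two example rankings are hypothetical, and the formula used assumes no tied ranks.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same objects."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same attributes")
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for the six attributes in the two sections: the first two
# attributes agree, the remaining four are reversed.
section_one = [1, 2, 3, 4, 5, 6]
section_two = [1, 2, 6, 5, 4, 3]

rho = spearman_rho(section_one, section_two)
print(f"rho = {rho:.3f}")
```

A value near 1 would indicate congruent orderings, a value near 0 no relationship. With only six attributes the coefficient is coarse, so it is best read descriptively.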


If Table 3 is discouraging regarding comparison between sections, 
it yet further corroborates the existence of stability among raters 
within the two sections viewed independently. We have already noted 
this phenomenon in the earlier discussion on mean scores assigned to 
the 30 items. It seems significant that a similar stability characterizes 
the second section, where raters were assessing more comprehensive or 
global dimensions; for example, each of the six groups without exception 
rank-ordered Originality as 6. 


Table 3


Comparison of Rank-Ordering of Attributes in Two Sections


[Table body illegible in the source scan. The table has two panels, each giving values for the six rater groups (Faculty, Students, Teachers, Admin., Assoc. 1, Assoc. 2) on the six attributes: (a) rank orders in the "Order of Important Attributes" section, showing the means of rank orders assigned to each attribute; (b) rank orders derived from the 30 items in the "Personal Dimensions" section, showing the sums of mean scores on the five component items within each attribute.]


Table 4 indicates the high degree of consistency between mean scores
on the overall impression and the ratings on the 30 personal dimensions
when reduced to the means of mean scores.


Table 4


Comparison of Overall Impression Ratings
and Personal Dimensions Ratings


                  Faculty      Students  Teachers  Adminis-  Assoc.  Assoc.
                                                   tration     1       2

Mean Scores
on Overall        [illegible]    5.10      5.27      4.80     4.98    5.10
Impression

Mean of Mean
Scores on            5.39        5.22      5.35      5.05     5.24    5.23
30 Items


In light of the above comparison, it could be argued that in the 
interest of economy and simplicity the single overall impression 
rating could suffice as the admissions index, rendering a longer assess- 
ment instrument unnecessary. The investigator's view, however, is that 
a more valid overall impression score is produced when observers have 
been required to identify and to rate a number of discrete dimensions, 
as a preliminary exercise in consciousness-raising, before attempting 
a molar estimate. 


Support for construct validity is provided in the following section,
where the principal factors are seen to be stable across five of the six
groups. It is to be hoped that, in due course, statistical verification
may be provided, but it is anticipated that this statistic will merely
corroborate what is already apparent: the instrument yields consistent
and similar sets of scores even when used by relatively inexperienced
and untrained personnel utilizing a global procedure based largely upon
independent judgments.


Discussion of validity in this section will focus mainly on content
validity, with only brief reference to criterion-related validity, which
is the focus in the later set of results obtained six years after the
students' graduation.


Content validity provides evidence about the degree to which the 
measures of an instrument represent those distinct qualities, abilities, 
or attitudes which they purport to estimate. Such corroboration is 
particularly vital in the case of inferred mental or personality 
characteristics for which constructs are postulated. If it can be 
established that the assessment scores do, in fact, correspond to the 
phenomenon claimed, then a sound base is provided for subsequently 
demonstrating the instrument's predictive utility with reference to 
certain terminal or output variables. One powerful, but indirect, 
approach to content validation is through factor analysis, where it 
can be demonstrated that items cluster in patterns similar to those 


defined by previous researchers, such as Ryans and Bolton in this case. 


As the first step, collected data were translated into a 30 x 30
item inter-correlation matrix for each group of raters. It seems
significant that in none of the eight matrices does a single negative
correlation appear; in short, most of the items appear to be positively
correlated. One pair of items which consistently had low
inter-correlations, such that they may be considered essentially uncorrelated,
was irresolute-authoritative and autocratic-democratic. The single
lowest correlation emerged in the teachers' matrix, where
retiring-forceful correlated with critical-kindly (r = .005).


Each inter-correlation matrix was then analyzed using the principal
component method, a data-reduction procedure which attempts to explain
the variability by extracting variance-saturated components or factors
of descending magnitude until the total common variance (30 in this
case) has been accounted for. From the accompanying table of
eigenvalues (Table 5), it may be seen that the first factor in every case
accounts for a high proportion of the total variance, averaging out
to roughly 50% across the six groups. The fourth, fifth, and sixth
factors give some indication of the rapid loss of explanatory power
in the remaining 24 factors which are not shown here.
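The eigenvalue extraction described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not a reproduction of the study's computation: the ratings are simulated (30 items sharing one general factor), and the function name `correlation_eigenvalues` is hypothetical. It shows why the eigenvalues of a 30 x 30 correlation matrix sum to 30 and how one dominant component can absorb about half of that total.

```python
import numpy as np


def correlation_eigenvalues(ratings):
    """Eigenvalues of the item inter-correlation matrix, largest first."""
    corr = np.corrcoef(ratings, rowvar=False)  # columns are items
    eigvals = np.linalg.eigvalsh(corr)         # symmetric input, ascending order
    return eigvals[::-1]


# Simulated data (an assumption, not the study's ratings): every item shares
# one general factor, so all inter-item correlations are about 0.5.
rng = np.random.default_rng(0)
n_ratings, n_items = 500, 30
general = rng.normal(size=(n_ratings, 1))
ratings = 0.7 * general + 0.7 * rng.normal(size=(n_ratings, n_items))

eigvals = correlation_eigenvalues(ratings)
share = eigvals / eigvals.sum()  # the eigenvalues sum to the number of items

print("first six eigenvalues:", np.round(eigvals[:6], 3))
print("share of total variance, first component:", round(float(share[0]), 3))
```

Dividing each eigenvalue in Table 5 by 30 gives the same kind of proportion for the actual rater groups.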




Table 5 


Eigenvalues for Six Factors by Rating Groups 


Factors Faculty Students Teachers Administration Assoc. 1 Assoc. 2
] xs 19:7 14.109 14.224 in LOU 133032 14.571 
2 2.206 2252s, 2 aol e143 2.030 2.370 
3 eA) 15835 ileys les LOY 1.796 He O15 
4 0.799 etZe 1.098 0.893 Teon 1.024 
o 0.726 0.779 04875 0.738 0.940 0.842 
6 0574 0.746 On AU 0.618 0.887 0.792 
20.93 AN A\2 22400 23.34 20.02 o\G2ee 


The next step was to transform the principal components solution
into a preferred solution by rotating the reference axes in
accordance with the Varimax criterion. The general purpose of
this procedure is to clarify the salient features of each factor so that
the conceptual identity of the group of high-loading items may be
ascertained and understood.
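A varimax rotation can be sketched as follows. This is a minimal sketch of the raw varimax criterion using successive planar (pairwise) rotations, which is one classical way to compute it. The source does not say which variant or program was used, so the function name, the absence of row normalization, and the toy loading matrix are all assumptions made for illustration.

```python
import numpy as np


def varimax(loadings, max_sweeps=50, tol=1e-8):
    """Raw varimax: rotate each pair of factors by the angle that maximizes
    the variance of the squared loadings, sweeping until the angles vanish."""
    L = np.array(loadings, dtype=float)
    p, k = L.shape
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for j in range(k - 1):
            for m in range(j + 1, k):
                x, y = L[:, j].copy(), L[:, m].copy()
                u, v = x**2 - y**2, 2.0 * x * y
                A, B = u.sum(), v.sum()
                C, D = (u**2 - v**2).sum(), 2.0 * (u * v).sum()
                # Closed-form optimal angle for this pair of factors.
                phi = 0.25 * np.arctan2(D - 2.0 * A * B / p,
                                        C - (A**2 - B**2) / p)
                L[:, j] = x * np.cos(phi) + y * np.sin(phi)
                L[:, m] = -x * np.sin(phi) + y * np.cos(phi)
                largest_angle = max(largest_angle, abs(phi))
        if largest_angle < tol:
            break
    return L


# Toy example (hypothetical): a clean two-factor structure, deliberately
# blurred by a 30-degree rotation, which varimax should undo.
simple = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
theta = np.deg2rad(30.0)
mix = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
mixed = simple @ mix
rotated = varimax(mixed)

print(np.round(mixed, 3))
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each item's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across factors is sharpened.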


Tables 6 and 7 depict a six-factor Varimax solution for faculty 
and administrators, with item loadings grouped by the specified 
attributes. Several features deserve comment. (1) The factor structure 
is dissimilar in the two solutions, with the faculty pattern representa- 
tive of all the other rater groups and the administrators' pattern 
providing the single maverick. (2) If one considers the items classified 
within each given attribute, many, but not all, load high on the same 
factor. Apparently, there is a measure of conceptual commonality between 
certain attributes and certain factors. 


Varimax Factor Loadings for 30 Items 
Grouped by Attributes (Faculty: N = 933) 


Attri- Item 

butes No. 
I 
i nies 
12 ae i 
Empathy 16 af 8 
21 BLCL 
24 .243 
3 eros 
Organi- 8 ~415 
zation 10 . 348 
20 neo 
27 . 164 
5 goyile, 
Leader- ibs .704 
ship 19 .792 
25 564 
30 .800 
] 581 
9 .605 
Buoyancy 1/7 594 
23 .724 
28 .607 
2 Thee 
Origi- 6 299 
lit 14 eee 
foe A 92 wes BPG 
26 .488 
4 347 
Profes- 11 Arie 
Sional 13 57 
Expression 18 Pay 
29 . 391 

Eigen- 

values 7.438 


Relative Pro- 
portion of the 32 
Common Variance 


igh 


.574 
LOU 
seeks 
sOU2 
eval 


204 
. 303 
5969 
Fog 
414 


.004 
ACA 
. 269 
. 206 
.054 


yA 
644 
“055 
.265 
noo 


YAS) 
462 
ep hl 
B48, 
selley! 


. 160 
sou 
. 200 
eey As 
meu 


ewe 


52 


Table 6 


Fact 
ital 


.124 
‘SIS 
ats 
1330 
.269 


. 188 
.659 
. 548 
.965 
. 580 


.974 
SOrL, 
.246 
437 
.282 


.264 
.120 
s023 
Aas) 
345 


. 346 
Be 
. 184 
404 
OG 


ar 
24] 
. 384 
Sue 
460 


4.709 


ors 


-0.043 
-0.008 


-0.027 
.014 
. 108 
sao 
. 180 


.018 
nue 
.058 
~407 
. 380 


. 148 
LOT 
. 156 
.454 
gerade 


-0.091 
-0.120 
-0.003 
.036 
wry 


-0.079 
.006 
mAs 
.243 
Sis) 


-0.082 
-0.040 
Shs) 
. 106 
.045 


Eels 


100 


Common- 
alities 


sick 
.683 
nel hoa! 
ahs 
a7 10 


Ae 
. 746 
10) 
ress 
754 


.664 
oi 
r7 90 
ase 
190 


a f/2 
.820 
alee 
AVES) 
.726 


.839 
2790 
.806 
. 766 
.762 


.854 
-OI9 
.651] 
.809 
.816 


22.932 


oe 


Tables 8, 9, 10, 11, 12, and 13 show the items loading above .65, 
rank-ordered for each of the six factors. It will be seen that not a 
single high-loading item appears under Factor VI for any of the six 
rating groups. Similarly, few high-loading items fall out in Factors 
IV and V, the largest clusters containing only two items. In addition, 
the high-loading items in Factors IV and V display little commonality 
between rater groups; we seem to be dealing with elements which are 
disparate, residual, and group-specific rather than consistent and 
generic in nature. 
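The selection rule used in Tables 8 to 13 (items loading above .65, rank-ordered within each factor) can be sketched as below. The function name and the small loading matrix are hypothetical and are not taken from the tables; the sketch simply applies the stated cutoff to the raw loadings.

```python
import numpy as np


def high_loading_items(loadings, item_numbers, threshold=0.65):
    """For each factor (column), list (item, loading) pairs above the
    threshold, ordered from the highest loading down."""
    clusters = []
    for j in range(loadings.shape[1]):
        column = loadings[:, j]
        order = np.argsort(-column)  # indices from highest to lowest loading
        clusters.append([(item_numbers[i], float(column[i]))
                         for i in order if column[i] > threshold])
    return clusters


# Made-up loadings for five items on three factors (illustration only).
loadings = np.array([
    [0.82, 0.10, 0.20],
    [0.70, 0.30, 0.15],
    [0.40, 0.75, 0.10],
    [0.25, 0.68, 0.35],
    [0.30, 0.20, 0.60],
])
item_numbers = [1, 2, 3, 4, 5]

clusters = high_loading_items(loadings, item_numbers)
for factor, cluster in enumerate(clusters, start=1):
    print(f"Factor {factor}: {cluster}")
```

A factor whose list comes back empty, like the third one here, corresponds to the situation the text reports for Factor VI.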


Because of the foregoing considerations, the investigator determined
to focus attention on the first three factors from the six-factor
varimax solution. The first three factors appear to provide a more
solid basis for a discussion of item validation (and possibly eventual
item reduction). Further analyses using three principal components
followed by a varimax rotation would be a desirable extension for
exploration of the underlying structure. However, that activity was
not undertaken for purposes of this presentation and must be relegated
to a follow-up study.


Table 14 presents the frequency distribution of the high-loading items
for five of the six groups of raters, the administrators being excluded.
On examining the items loading highly on Factor III, one observes
dimensions such as (3) disorganized-systematic, (8) aimless-purposeful,
(20) inconsistent-consistent, (4) fuzzy-precise communication, and (27)
irrational-rational. It is apparent that Factor III represents the
concept of organization. (How the two items, (6) rigid-adaptable
and (7) autocratic-democratic, came to be included in Factor III by
the students' group is not clear.)


An analysis of Factor II's high-loading items reveals dimensions
such as (16) critical-kindly, (12) suspicious-trusting,
(24) antagonistic-cooperative, (7) autocratic-democratic, (14) inflexible-open-minded,
(6) rigid-adaptable, and (21) hindering-helpful. It seems clear that
Factor II represents the concept of empathy.


The salient items in Factor I include (2) dull-stimulating, 
(11) unimpressive-personal magnetism, (19) lethargic-vigorous, 


Attri- 
butes 


Empathy 


Organi- 
zation 


Leader- 
ship 


Buoyancy 


Origi- 
nality 


Profes- 
sional 


Expression 29 


Eigen- 
values 


Relative Pro- 
portion of the 


Common Variance 


54 


Table 7 


Varimax Factor Loadings for 30 Items 


Grouped by Attributes (Admin.: 


29 


I] 


.009 
OG 
Ae) 
300 
aL OL 


.710 
.670 
2570 
.500 
Pood 


.807 
42) 
560 
.601 
A0e 


se 10)2) 
~230 
yo, 
ES 
.457 


~429 
149 
BUsg 
~425 
. 349 


547 
ooe 
.628 
seo) 
woe 


Sele 


Factors 
Bi IV 
-o720 .080 
264 .049 
BO, mney 
nl Ly, a125 
.178 .240 
PHT Loe 
ROLO Sst 
e207 226 
LO 293 
. 140 =30i 
B25) mAs 
531 SW 2 
~545 mos 
2302 sil Tesi 
. 388 239 
.630 .196 
656 arate 
mac eal 
594 ~452 
.500 . 340 
.696 .158 
oye . 186 
-302 082 
. 369 . 146 
. 398 153 
320 ~549 
B03 085 
aed a Hla 0 
.525 .438 
254 .801 

5.379 2.265 
78 87 


N = 254) 


Common- 
alities 


7091 
AAS 
ag! dois) 
pS 
By As\) 


Byes 
ag hile) 
. 796 
cio 
742 


747 
.828 
. 166 
.774 
.803 


.748 
. 700 
.814 
TOLL 
761 


82] 
.806 
AASV 
.824 
n/9e 


.820 
.834 
.689 
AE 
Boia 


23330 


on 


Table 8 


Varimax Factor Loadings for 30 Items (Faculty: N = 933), 
with High-Loading Items (Higher than .65) Rank-Ordered by Clusters 


Item No. I II III IV V VI Commonalities
30h) 4800 -050ne ©. 280RN. 052 -130 216 790 
19 T92e. <2608? ©. 236hae 050 105 146 790 
1m grold Basbeee. 24 bee-238 211. -0.040 819 
ba) ayeket | 2heke .225REa 220 281 036 bis 
yl agneek 2708s 346A 288 ‘ore -0a079 839 
5th elder 2200, 367neR 295 174 097 “809 
16 MiToe- V78a8! —.11 8808. .255 146 108 B37 
12) ees .7s0n) 1958 184 soar. 094 683 
l4& peecee? Tage » 2699. 152 lee 180 ti 
7AM Mee rs: MEN SEV? Ue a+ aera 139 237 Fh) 
cso eee’ yon Daa ab 127 018 oe 
Tigi? ail) 30° quale dV)? B18 hers, (ez 854 
B9 Guise) acocm  .6bomeA .17oame 0.073 129 746 
jam Ueki ban. See See 050 ae aes 791 
Bn wo00-, .462¢. -225eme’ 652) 088 006 790 
8 427 370 117 118 672 106 809 
1 581 520 Gia = 187 084° = £0709) 722 
5 545 004 5ydgee 115 077 138 664 
9 605 644 120.089 Geant —ObAeO 820 
10 348 569 GACMNE 11 d¢an 60.008 058 761 
13 557 200 384. =. 156 172 she 651 
14 224 571 18dane 592 120 176 806 
17 594 555 023m. 166 182  -0.003 722 
20 237 39] SG5aan 129 110 407 723 
22 526 263 404 430 112 243 766 
25 564 206 Acta 130 002 454 767 
26 488 187 316M. 516 153 316 762 
27 164 414 5G0hme. . 163 222 380 754 
28 607 390 345¢Rer. 114 211 171 726 
29 39] 230 A60mpe .3 12 620 045 816 


EIEN AS laa 5.7975 4,709'' 924.420 1.539 1.105! 22.932 
values 

Cumulative 

Proportion 32 57 78 88 95 100 

of Common 

Variance 




Table 9 


Varimax Factor Loadings for 30 Items (Students: N = 467), 
with High-Loading Items (Higher than .65) Rank-Ordered by Clusters 


Item No. I II III IV V VI Commonalities
19 a Gls oso .119 . 198 .292 . 104 -748 
2 ffs) . 146 2320 243 142 . 143 .799 
20 hci 2135 HTK S . 149 aise . 200 .749 
i reall . 344 miley AO} 2 .096 . 143 .696 
i a6 fails (sey ad cs . 83 ese Aas 
9 Aes 43] .154 .070 -0.102 e107 B/ 45 
28 .654 249 .149 . 206 .238 234 .666 
16 . 264 754 .310 .044 O71 .068 . 746 
24 yall! oye aye vibes Ney | ei be he] .683 
iz 24] ar Atys} 254 oL38 .095 .134 .669 
7 AVA . 349 744 .065 -0.003 .070 ./14 
14 yal e395 iS. .098 .101 Al aa’ . 760 
6 Teo /, 34] ais y .OT5 .050 aya) . 736 
5) 126 061 077 764 184 345 761 
8 384 aor lis .683 167 006 711 
13 252 248 108 . 142 aC ale 167 699 
UG 324 071 234 246 .681 077 695 
] 622 325 322 . 266 104 011 678 
4 301 026 Isis) .474 ay 597 752 
5 406 -0.029 -0.061 .496 453 097 630 
10 212 369 249 ed 180 103 59] 
ies 637 091 359 .296 326 138 756 
18 SH, 45] 116 .073 Wz yf) 663 
20 035 306 220 .408 456 276 594 
21 Dee 566 352 Ba fe 193 203 642 
ec 401] 107 sys .248 33 105 675 
26 422 046 ae ml cal 350 153 629 
| 004 362 043 aoe 446 404 653 
29 287 170 . 141 . 200 146 749 153 
30 649 -0.063 -6.003 19] 543 132 THES: 
Eigen- 
Satie 6433; “3. 65/a 73. 43Gren ce cld 2.740 20039; 21s] 
Cumulative 
Proportion 30 48 64 77 90 100 
of Common 


Variance 




Table 10 


Varimax Factor Loadings for 30 Items (Teachers: N = 346), 
with High-Loading Items (Higher than .65) Rank-Ordered by Clusters 


Item No. I II III IV V VI Commonalities
19 .837 —.086 094 158 158 208 810 
30 .828 -0.057 299 027 118 067 797 
2 750 .244 268 223 136 047 764 
23 687 313 159 541 007 028 799 
11 667 355 246 308 234 146 802 
15 667, 278 367 201 11 167 737 
5 655 -0.059 463 053 24) 050 710 
7 R128 83] 140 .076 089 029 740 
6 22 | 788 134 023 093 004 697 
14 096 ver 229 093 015 079 697 
16 -0.064 669 136 058 153 506 753 
12 193 651 116 260 344 109 672 
3 meer 146 808 ~—s««. 14 028 005 770 
27 205 263 12a 2258 003 161 698 
20 288 165 Bivee .. 1372 034 378 737 
8 402 267 658 .053 035 104 681 
18 219 149 142 804 165 204 810 
29 354 148 407 Fil 052 -0.085 836 
1 569 443 235 201 267. ~+=~-0.090 692 
4 405 200 609 448 120. eRe 797 
9 429 416 079 195 629 111 810 
10 258 356 593 .076 190 398 746 
13 534 191 494 140 337, +=-0,162 726 
17 530 306 087 253 537 178 767 
2] 242 447 345 .140 079 593 755 
22 527 45] 460 s097,." -ORtSI 7 739 
24 229 616 255 eo2eu -Oegd4 361 709 
25 646 125 468 1080. an0R0io | “-0206/ 667 
26 601 438 295 ‘120 -06290 052 743 
28 635 25] 376 305 110 082 719 
Eigen- FLORIS 120) ma bay e853 1.424 1.363 22.378 
values 
Cumulative 
Proportion p2 55 1a 88 94 100 


of Common 
Variance 




Table 11 


Varimax Factor Loadings for 30 Items (Administrators: N = 254), 
with High-Loading Items (Higher than .65) Rank-Ordered by Clusters 


[Table body illegible in the source scan. Columns: Item No., Factors I to VI, Commonalities, followed by rows for eigenvalues and cumulative proportion of common variance. The only legible entries are item numbers in the first column: 24, 16, 14, 2, 6, 7, 21.]


Table 12 


Varimax Factor Loadings for 30 Items (Assoc. 1: N = 660), 
with High-Loading Items (Higher than .65) Rank-Ordered by Clusters 


Item No. I II III IV V VI Commonalities
1] es) .228 AAS “iva el37, . 100 . 760 
79 mT) S) 230 see .040 .019 .058 et PSE: 
Z ka: 167 186 £052 218 253 Tif 
23 144 ZV . 198 .040 (fame 167 718 
30 709 -0.021 .249 Pel Be 229 034 ry) 
15 673 148 299 . 168 214 310 735 
16 144 LIZ 15] 2202077 083 166 591] 
24 138 fA: 348 -059 052 -0.097 662 
3 189 087 744 .050 025 111 612 
i 189 191 189 [Ome 107 076 794 
18 397 256 -0.019 sa keys 654 -0.012 668 
29 180 166 3h F052 715 126 730 
] .487 577 122 mOZ 015 az 608 
4 274 040 550 ~185 439 146 627 
5 446 -0.063 407 .480 233 -0.072 650 
6 s193 494 070 A562 205 427 642 
j .050 617. -0.149 062 283 37] 627 
8 383 2\2 645 Wl 054 216 672 
9 .611 527 026 . 164 -0.038 005 680 
10 ci 526 459 wide 023 057 619 
2 . 200 600 184 .045 066 005 440 
14 N03 625 171 347 176 302 672 
17 .604 500 -0.062 siesiel 095 061 648 
20 .097 282 643 maiets: 143 024 558 
21 225 599 430 .082 052 034 605 
Be 416 236 459 .058 041 a) $52 
25 419 158 395 .489 -0.048 225 648 
26 503 166 299 . 100 057 630 780 
“ali 098 448 582 ley 204 040 617 
28 589 362 421 . 138 037 021 675 
Eigen- 
values 6901 Ae 4 795 38979 eels 12683 ROS 7a ec 02e 
Cumulative 
Proportion 31 55 75 84 o2ee 100 
of Common 


Variance 




Table 13 


Varimax Factor Loadings for 30 Items (Assoc. 2: N = 778),
with High-Loading Items (Higher than .65) Rank-Ordered by Clusters 


Item No. I II III IV V VI Commonalities
2 » 155: 249 246 .003 Lee aes .789 
30 744 -0.021 Real Rey ae. ies .040 oO 
1] 743 Loe 196 . 106 S232 . 109 bolder: 
ge 7 ATgeh . 00 Rie Omen 87 174 129 B7P2 
19 soos a ie eye) meen -0.002 .089 mil alle 
[Ps .660 a2 0) . 346 .040 Bali . 342 e755 
16 .198 Pre .195 -0.072 .104 -0.009 057 
Ke VAG ay iat aA) Pele O78 075 .630 
I O51 eval PON Ze 02052 Pal ig, 293 rOGy 
24 14S .688 429 BEG 0G .008 P/O2 
7 NG) = ODay 194 teee 257 .073 365 788 
3 1285 104 800, EOU, 105 121 doe 
20 S202 294 ays: 216i .086 036 705 
8 45165) 163 .695 .074 sie 144 693 
URS 187 345 216 .678 232 109 27, 
18 309 334 103 a 718 -0.063 750 
| 517 534 e300. 067 027 035 650 
4 293 095 639 .039 392 259 125 
5 513 -0.118 -392 a5 174 -0.021 634 
6 163 57/5 ard by . 180 176 428 651 
9 513 612 20 Bibel) 069 -0.113 718 
10 314 399 .626 .209 -0.098 108 Pld 
hig, 483 559 054 065 259 011 620 
2] 203 550 582 Pe jt ts) 053 082 705 
(at 445 2350 463 . 100 -0.036 546 774 
25 385 170 445 .500 020 329 734 
26 484 169 356 .159 087 561 738 
fei 134 301 638 SOO, 186 106 655 
28 adit rat) 5G 255 118 098 689 
29 223 133 408 201 636 210 Tey 
Eigen- 


values SO7oy. bed hles 25.247 geal ose 1.586 1.561) — 21h 


Cumulative 
Proportion 28 
of Common 
Variance 


Table 14


Frequency Distributions of High-Loading Items on Three
Principal Factors for Five Rater Groups


[Table body illegible in the source scan. For each of Factor I, Factor II, and Factor III, the table marks by item number which items load highly for each of the five rater groups: Faculty, Students, Teachers, Associates 1, and Associates 2.]




(23) expressionless-expressive, (30) retiring-forceful, (15) uninspiring- 
challenging, (17) humourless-sense of humour, (5) irresolute-authoritative, 
(9) pessimistic-cheerful, and (28) apathetic-alert. It seems that 

Factor I represents the concept of dynamic vitality, to which we will 
assign the name surgency, borrowed from Cattell (Ryans, Dali) 


The conclusion to which this argument leads is that the first 
three factors of the Varimax six-factor solution support the content 
validity of the high-loading items in Table 14 and that one may use them 
(and possibly their generic equivalents) with considerable confidence 
that they are measuring what they purport to represent. (No such 
claim can be made for the other items). 


The results of this analysis based on the first three factors
correspond closely with Ryans' work. His TCS Pattern X (kindly,
understanding, democratic vs. aloof, restricted, egocentric teacher behaviour)
seems related to the Empathy of Factor II. His TCS Pattern Y (responsible,
systematic vs. evading, unplanned teacher behaviour) appears to involve
the same behaviour as the Organization of Factor III. Finally, Ryans'
TCS Pattern Z (stimulating, imaginative vs. dull, routine teacher
behaviour) would seem to represent the Surgency of Factor I.


It is particularly gratifying when content validation can be not only
established by factor analysis of the data but also corroborated
by the findings of the original investigation. To the
extent that the results of the present study replicate those of Ryans,
it may be argued that it has served a cross-validation function,
providing mutually supporting evidence for the claim of validity.


Ideally, a study such as this ought to address the question of
predictive validity. Do the interview scores serve to predict applicants'
teaching effectiveness? For a number of reasons, that question cannot
be answered from these data alone.


CHAPTER VII 
RESULTS AND CONCLUSIONS 


A faculty of education selection process is designed to select 
the "best" set of applicants to admit. Typically, "best" has been 
defined as those with the highest undergraduate grades, the highest
test scores, and the best performance on an interview. There are 
many reasons why best has been defined by these measures, but the 
basic argument is that these will predict academic performance in 
faculty of education programs and, unless academic performance is 
adequate, other criteria do not matter. 


Many educators have been troubled by the fact that the predictive 
limits of the admissions process are often faculty of education course 
grades and associate teachers' ratings. Queen's Faculty of Education 
has sought to extend this limit. This raises some problems. Whereas 
everyone will agree that a good student must at least have good grades, 
there is little agreement about the characteristics of a good teacher. 
Nonetheless, we offer a simplistic set of results directed at identifying 
success as a teacher. 


The first criterion for success is "years of teaching since graduation".
This is a baseline criterion widely accepted in industry and is
usually called "seniority". This is a weak measure of "goodness", but
one cannot evaluate other criteria unless the person is employed. The
second criterion of post-graduate performance is the principal's rating
of the teacher's performance. This is a simple year-by-year rating of
teacher performance on a scale from poor to excellent. By examining the
admissions and teaching characteristics of rated teachers, we may begin
to infer what characteristics make a good teacher, at least from the
principal's viewpoint.


The data collected in this study comprised 255 principals' ratings of teachers
employed in their schools and employment data on 346 teachers. These
success measures were related to performance in the admissions interview
and in practice teaching. Many statistically significant correlations
were found, but none larger than 0.33. The majority of significant
correlations, however, were about 0.15 ± 0.04. Exact values for the
correlations are presented in Tables 15 to 19. We first examine the
admission interview ratings in relation to seniority.
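To see why correlations this small can still be statistically significant, and how little variance they account for, the following sketch may help. It is an illustration only: it assumes a sample size of 255 (the number of principals' ratings; the n behind any individual correlation in Tables 15 to 19 is not stated), and it uses the Fisher z approximation rather than whatever test the authors applied. The function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, via the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


n = 255  # assumed sample size for illustration
for r in (0.10, 0.15, 0.33):
    p = correlation_p_value(r, n)
    print(f"r = {r:.2f}   r^2 = {r * r:.3f}   p = {p:.4f}")
```

Under this assumption a correlation of 0.15 clears the conventional .05 level yet shares only about 2% of its variance with the criterion, and even the largest reported value, 0.33, shares about 11%.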


The teachers with the most teaching experience now were those whom
faculty had perceived as cheerful and cooperative in nature. They appeared
alert, self-reliant, and verbally fluent in their interviews. Teacher
ratings of this group were different. Those whom the teachers perceived
as least self-possessed and least responsive are the ones who
have obtained the most teaching experience. In contrast, administrators
saw the currently more senior teachers as the more authoritative, purposeful,
and self-possessed of the interviewed applicants. The student view focussed
upon those who were least helpful, but most resourceful and adaptable. The
significant characteristics which predicted seniority seem to vary with the
position of the rater. Each rater group has different concerns. Quite
possibly, the characteristics found significant for each rater group are those
of most concern to, or most characteristic of, people in those positions.


In addition to the admissions interview, the sample teachers were
rated during their associate teaching experience. Those who were deemed
most responsible and who made the best overall impression in their practice
teaching are those who have obtained the most seniority since graduation.


The other success criterion we considered was a principal's rating of
the teacher's performance in the current and previous school years. Many
more variables were significantly related to this more subjective criterion.
In part, this is because ratings tend to correlate better with other
ratings than with alternative types of success measures (e.g., rank
orderings, percentiles, etc.). Nonetheless, there were a variety of admission
ratings which weakly predicted the 5-year post-graduation performance
ratings of teachers made by principals.


For the faculty admission interview ratings of applicants, four of 
the characteristics which predicted seniority also predicted performance 
ratings. Specifically, those rated as cooperative, self-reliant, alert 
and fluent, were most likely to obtain good performance ratings, as well 


Table 15 


A. RATINGS GIVEN BY FACULTY MEMBERS 
AT THE ADMISSION INTERVIEW 


Characteristic Seniority Performance Rating 
Responsive .03 Us 
Stimulating U5 Aba 
Systematic “U5 aay 
Communicative 205 Ay Wes 
Authoritative .09 .O1 
Adaptable .O1 .08 
Democratic e001 .06 
Purposeful 303 JU? 
Cheerful . 16 .06 
Responsible .0002 .06 
Personal Magnetism .09 adhe 
Trusting 02 SUA 
Self-Possessed 500d ANS iey 
Open-Minded .06 nue 
Challenging 05 .09 
Kindly .08 ae 
Sense of Humour .08 .09 
Pleasant Voice .09 ee 
Vigorous 200 .08 
Consistent .10 . 16 
Helpful .08 il 
Resource ful 705 215 
Expressive 06 lye 
Cooperative sl ieee oot ies ted 
Self-Reliant | .12 21 
Original .02 ee 
Rational -U5 .14 
Alert 219 ahs 
Fluent pels 15 


Force ful .U3 ial 


B. RATINGS GIVEN BY STUDENTS AT THE 


Characteristic 


Responsive 
Stimulating 
Systematic 
Communicative 
Authoritative 
Adaptable 
Democratic 
Purposeful 
Cheerful 
Responsible 
Personal Magnetism 
Trusting 
Self-Possessed 
Open-Minded 
Challenging 
Kindly 

Sense of Humour 
Pleasant Voice 
Vigorous 
Consistent 
Helpful 
Resourceful 
Expressive 
Cooperative 
Self-Reliant 
Original 
Rational 

Alert 

Fluent 
Forceful 


66 
Table 16 


ADMISSION INTERVIEW 


Seniority 
210) 
.04 
.04 
205 
215 
Loa 
nO 
04 
.09 
ae 
m0}e) 
07 
aed 
13 
.O1 
705 
Zang) 
-07 
.03 
.04 
iL8 
at 
eU5 
“05 
.008 
.06 
.08 
POR! 
.07 
.04 


Performance Rating 


004 
$12 
.06 
.07 
.07 
.07 
16 
10 
25 
.006 
.10 
.03 
.05 
12 
.10 
18 
14 
.18 
.07 
.08 
02 
.04 
13 
02 
.06 
.07 
afb 
.08 
.05 
02 




Table 17


C. RATINGS GIVEN BY TEACHERS AT THE 
ADMISSION INTERVIEW 


Characteristic Seniority Performance Rating 
Responsive 23 .09 
Stimulating 207 19 i 
Systematic 05 PALE 
Communicative O16 ao5 49% 
Authoritative .06 a5 
Adaptable 15 ale 
Democratic .08 fais 
Purpose ful .005 Aash & 
Cheerful OL2 02 
Responsible .004 25 * 
Personal Magnetism G03 ahd 
Trusting a6 rs 
Sel f-Possessed ac5 .04 
Open-Minded AO .04 
Challenging 07 +15 
Kindly pit .10 
Sense of Humour o12 .09 
Pleasant Voice POY a5 
Vigorous .04 .002 
Consistent aula .08 
Helpful aUS .08 
Resourceful .06 «16 
Expressive .02 gall 
Cooperative 101 2 
Self-Reliant .18 a0 
Original | .04 .10 
Rational . 16 aby 
Alert aie .05 
Fluent .08 meh) 


Forceful .06 07 


Characteristic 


Responsive 
Stimulating 
Systematic 
Communicative 
Authoritative 
Adaptable 
Democratic 
Purposeful 
Cheer ful 
Responsible 
Personal Magnetism 
Trusting 

Sel f-Possessed 
Open-Minded 
Challenging 
Kindly 

Sense of Humour 
Pleasant Voice 
Vigorous 
Consistent 
Helpful 
Resourceful 
Expressive 
Cooperative 
Sel f-Reliant 
Original 
Rational 

Alert 

Fluent 


Forceful 




Table 18 


D. RATINGS GIVEN BY ADMINISTRATORS 
AT THE ADMISSION INTERVIEW 


Seniority 


a5 
.O1 
Od, 
06 
ncO* 
ne 
.08 
.24 * 
06 
a3) 
Od 
06 
20) 
0c 
.02 
07 
.04 
07; 
.005 
.0008 
107 
Aas: 
=09 
.14 
oy! 
.03 
.04 
.08 
203 
.09 


Performance Rating 


06 
.06 
107 
.006 
a 7 
.06 
205 
{18 
-0§ 
.08 
709 
03 
32 
07 
aie 
07 
(ty 
(03 
OF 
12 
.02 
.08 
.16 
03 
016 
,08 
.04 
07 
.02 
10 


E. RATINGS GIVEN BY SUPERVISING TEACHER AFTER 


Characteristic 


Responsive 
Stimulating 
Systematic 
Communicative 
Authoritative 
Adaptable 
Democratic 
Purposeful 
Cheerful 
Responsible 
Personal Magnetism 
Trusting 

Sel f-Possessed 
Open-Minded 
Challenging 
Kindly 

Sense of Humour 
Pleasant Voice 
Vigorous 
Consistent 
Helpful 
Resourceful 
Expressive 
Cooperative 
Self-Reliant 
Original 
Rational 

Alert 

Fluent 
Forceful 




Table 19 


FIRST PRACTICE TEACHING ASSIGNMENT 


Seniority 
.002 
.04 
.007 
.06 
.04 
0/8 }0) 
.06 
.04 
BUY, 
GP * 
.009 
.04 
aL) 
03 
a3 
.008 
ul 
.06 
AO: 
Uy 
.04 
ni Obs 
.04 
On 
.02 
.009 
.03 
.07 
108 
408 


Performance Rating 


Aalst et 
.14 
eloe* 


.004 


70 


as obtaining the greatest seniority. In addition, good performance 
ratings were earned by those who were rational, consistent and self- 
possessed. Being communicative, expressive and having a pleasant 
voice also characterized the successful applicants. In addition, 
these successful teachers were rated as resourceful, original and 
forceful in faculty ratings of their admission interview performance. 
More generally, those making the best impression upon their school 
principals were those that made the best overall impression on faculty 
members at their admission interview. 


In addition to impressing faculty, these applicants also made 
the best overall impression upon students rating them at the admission 
interview. Students perceived the subsequently successful teachers as 
more kindly and having a more pleasant voice than teachers who achieved 
lower post-graduation performance ratings. 


Administrator ratings at the admission interview yielded only one 
characteristic which related to performance ratings, that is, self- 
possession. This was the only characteristic rated by this group with 
any significant degree of predictive power. 


The teachers who conducted admission interviews had different
perceptions from the other three rater groups; that is, the
characteristics that marked their subsequently successful teachers had
a different orientation. For them, the successful applicant was a
responsible person, one who was purposeful and stimulating, as well as
being communicative.


Surprisingly though, none of these characteristics were found signi- 
ficant in the practice teacher ratings. The characteristics which pre- 
dicted later performance ratings changed completely in the classroom 
setting. The successful teacher was the one rated most helpful and most 
responsive in practice teaching. They were alert, original, and 
vigorous in their work. They were systematic and consistent while being 
democratic. Basically, the teachers obtaining the best principal ratings 
were those who made the best overall impression in their practice teaching 
experience. Interestingly enough, they were also the students who achieved 
the highest grades in their other faculty of education courses. It would
seem that the characteristics which make for a good student and a
good practice teacher also help make for a good teacher in subsequent
practice.


This series of weak correlations cannot be easily summarized. The
positive correlations can be explained "after the fact" with little 
difficulty, but the three negative correlations are counter-intuitive 
and puzzling. There are several explanations for observing these nega- 
tive values, as well as for the small correlations we found in general. 
The most plausible of these explanations results from the very selected 
nature of the group we are observing. This resulting restriction of 
range phenomenon will be discussed after presentation of the remaining 
results. 


For the moment we change the focus of the analysis. In previous 
analyses, we were looking for predictors of seniority and performance 
ratings among practicing teachers. These are acceptable success criteria 
according to the principles of organizational psychology, but we have not 
considered a more basic success criterion, i.e., getting a job in teaching. 
This is certainly a distinction which divides the graduates in a meaningful 
way. It is recognized that there may be many reasons for succeeding or 
failing to be employed five years after graduation which our analyses 
will not reveal. For example, where one wants to live will have a signifi- 
cant impact on employability, yet this is not captured in our analyses. 
Additionally, subject specialty area can have a major impact. Despite
our cognizance of additional major factors, we have limited ourselves
to analyzing the variables in the Queen's interview schedule.


Table 20 gives the average admission ratings of the currently employed 
and unemployed teachers. The average admission interview ratings which 
faculty members gave the currently employed group were higher on 24 of 
the 30 rating scale dimensions, but they were significantly higher for 
only two dimensions: dependent versus self-reliant and unintelligible 
versus fluent. There was no theoretical basis for expecting these two 
particular dimensions to be the only ones which were significant. It 
is probable that these are significant by chance alone, but there was
a tendency for currently employed teachers to have obtained better
admissions interview ratings from faculty members than did those who
are currently unemployed.


TABLE 20 - ADMISSION INTERVIEW RATINGS

[Table: average rating and standard deviation on each characteristic
and on overall impression, for currently employed and unemployed
graduates, as rated by faculty and as rated by students, with maximum
sample sizes in the final row. An asterisk marks differences for which
the t-test with pooled variance estimate is significant at p < .05.
The individual values are illegible.]


As one can see, however, this was not true for the interview 
ratings given by the student interviewers. Students rated the employed 
group higher on half the characteristics, and lower on the remaining 
half. Overall, though, they tended to perceive the currently employed 
group in a slightly more negative way. Despite this perception, the 
only two significant differences observed were both in favour of the 
employed group. The employed group was rated as significantly more 
systematic and purposeful in their interview performance. These skills 
could certainly yield a slight advantage in "job-hunting" after gradua- 
tion. Perhaps it is the ability to find work rather than the ability 
to teach which these ratings reflect. 


Table 21 provides a contrast to these results. The raters in both 
cases here are practicing teachers, but the first set of ratings was 
based upon an admission interview and the second set upon observing the 
student teacher in a classroom setting. The contrast between these
settings was marked.


The teacher-based interview presented a pattern similar to the other
interview ratings; however, there were no significant differences
between comparison groups in any context. Even the overall impression
of the two groups was similar.


On the basis of practice teaching observations, however, the two
groups were significantly different on a number of characteristics. For
all 17 significant characteristics, the currently employed group was
rated more positively than the unemployed group. In fact, for 29 out of
30 comparisons, the employed group was rated more positively. A
comparison of the grades of currently employed (3.679) versus
unemployed (3.647) graduates also revealed a significant difference in
favour of the employed group (t = 2.09, p < .05). In this case, as in
the others, the statistically significant differences were not large
enough to be truly diagnostic. Nonetheless, they suggested a number of
possibilities and led us to raise a number of substantive issues.
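
The employed-unemployed contrasts rest on a t-test with a pooled
variance estimate. As a minimal sketch of that calculation, the
function below computes the statistic from group summaries. The means,
standard deviations and group sizes in the example call are
hypothetical values chosen only for illustration; they are not taken
from the study.

```python
from math import sqrt

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic using a pooled variance estimate.

    Returns (t, degrees_of_freedom). Assumes both groups share a
    common population variance.
    """
    df = n_a + n_b - 2
    # Weight each group's variance by its own degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

# Hypothetical summaries: employed vs. unemployed ratings on one scale.
t_value, df = pooled_t(mean_a=5.2, sd_a=1.0, n_a=200,
                       mean_b=5.0, sd_b=1.0, n_b=100)
print(f"t = {t_value:.2f} on {df} degrees of freedom")
```

With groups of this size a difference of a fifth of a scale point can
approach significance, which is consistent with the remark that
significant differences need not be large enough to be diagnostic.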


TABLE 21 - PERSONALITY RATINGS BY TEACHERS

[Table: average rating and standard deviation on each characteristic
and on overall impression, for currently employed and unemployed
graduates, as rated at the admission interview and as rated in
associate teaching, with maximum sample sizes in the final row. An
asterisk marks differences for which the t-test with pooled variance
estimate is significant at p < .05. The individual values are
illegible.]


First, they suggest the obvious fact that an admission interview 
is vastly different from an associate teaching experience. In the 
interview, the raters must operate on superficial impressions of the 
student, and they must try to imagine how good a teacher this person 
would make. In practice teaching, the rater has a chance to actually 
observe the person's competence, not just to think of how good they 
might be. 


It is axiomatic, then, that stronger opinions should be formed 
in the classroom setting. That these classroom ratings predict who 
will later be employed is not surprising for two reasons. First, a 
sample of two persons' work on day 10 is the best predictor of their 
work on day 100. Secondly, the classroom experience is closer in time 
than the admission interview to our study observations. As a result, 
fewer factors are likely to have intervened and distorted pre-existing 
relationships. 


Two simple inferences can be made from observed employed-unemployed 
contrasts and from correlations between success criteria and earlier 
performances. First, those who were hired were the best of the students 
and, thus, only the best members of the class are presently employed. 
Secondly, of those who are employed, the most successful are those who 
were the most successful students. That is, those who received the 
best grades and the best associate teacher ratings are most successful 
at getting jobs, retaining jobs, and being considered good at those jobs. 


This is a very appealing conclusion but is probably too simplistic. 
It seems equally likely that the people who obtained jobs were those who 
performed better in the Queen's program. This is to say that when school 
boards hired teachers, they tended to hire those with the best academic 
grades and best practice teacher ratings. Certainly this is to be expected, 
and explains the observed differences somewhat more satisfactorily. In 
short, admission interviews do not predict later employment because they 
are not a basis for hiring. Academic grades and associate teacher ratings 
predict later employment and it is here assumed that these factors are the
basis for employment decisions.




In the final analysis we do not know definitively whether the 
employed teachers are any better than the unemployed group. The issue 
hinges entirely upon the validity of the rating scale and particularly 
upon what rating differences are meaningful. In order to validate the 
instrument we must have some criteria for what makes a good teacher. 


Our first assumption was that principals are in a position to
globally assess teacher effectiveness and that their assessment of
teachers' effectiveness is indeed valid. Without explicit criteria to
define attributes of good teachers, one is left with using data assumed
to relate positively to "good teaching". An argument could be advanced
that such evidence is circular in nature and thus misses the essential
point upon which such a study should focus. Hence, without an explicit
construct defining what characterizes a "good teacher" and an
identification of variables that lend themselves to reliable
measurement, one may overlook relevant data.


As an addendum, we would like to raise yet another possibility. It
is possible that real differences do exist between the employed and
unemployed group. The reason that these may go undetected is
"rating-inflation". Raters have a perverse tendency these days to want
to rate everyone positively. Thus an average, a good, or an excellent
performer is rated a six or a five. The rest of the scale goes unused
and our ability to distinguish between qualities of performance is
lost. We offer no practical solution to this problem other than a
request for more meaningful ratings. Whether this will help us to
distinguish good from bad in teaching, however, is doubtful. It is only
by establishing standards of practice that we can really begin to
tackle this issue. As it stands now, our only standard is seniority.


These comments have by-passed consideration of the correlational
results. These cannot be explained in terms of hiring criteria, but
the weakness of these relationships suggests two alternative
statistical explanations.


Using seniority as a success criterion, 15 correlations were found
significant. By chance alone, statistical theory predicts that eight
correlations will be significant, even though there are no true
differences in groups underlying this observed significance.
Considering that this data consists of five independent comparisons of
the same 31 variables, and that only one variable (self-possessed) was
significant in more than one comparison, it seems likely that chance
has played a major role in the results obtained in these analyses.
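
The chance expectation of eight can be checked with a line of
arithmetic. The sketch below assumes the conventional .05 significance
level, which the figure of eight implies but the paragraph does not
state, and treats the tests as independent; the function name is
illustrative only.

```python
def expected_chance_hits(n_comparisons, n_variables, alpha):
    """Expected number of tests reaching significance when no true
    relationship exists, assuming each test has probability alpha
    of a false positive."""
    return n_comparisons * n_variables * alpha

# Five rater-group comparisons of the same 31 variables at an
# assumed .05 significance level.
expected = expected_chance_hits(n_comparisons=5, n_variables=31, alpha=0.05)
print(round(expected))  # number of chance "significant" results to expect
```

Five comparisons of 31 variables give 155 tests, and five per cent of
155 is just under eight.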


The other phenomenon of relevance here is restriction of range or
pre-selection. The use of correlational or variance-based analysis
methods is predicated upon the assumption that all variables are normally
distributed in the population of interest. Each time we eliminate some
people from further consideration, this assumption of normality becomes
less tenable. Consider the selection process for this group: A% of
high school students apply and are selected for admission to university;
B% of this A% obtain degrees; C% of the degree group apply and are
accepted to the faculty of education; D% of this group graduate; E%
obtain jobs. In the original case (high school students), the A% had
a relatively normal distribution of abilities and there were a lot of
differences between applicants. The A% had different career interests
and goals and attended a wide variety of schools. By the time we reach
our final stage, however, the group is much narrower. Now we are
referring to people with enough ability and interest to get a Bachelor's
degree. This is not a very large proportion of the population at large.
In addition, we are only referring to degree holders interested in
attending a single B.Ed. program in Kingston, Ontario. Thus the group
is further homogenized by their common interest in teaching and their
willingness to attend a program in Kingston. This group is unlikely
to vary widely in ability or interest compared to the group we started
with. As a result of this lack of variation, we observe very weak
correlations.
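
The effect of pre-selection on a correlation can be illustrated with a
small simulation. This is a sketch on invented data, not a reanalysis
of the Queen's file: the population correlation of .50, the sample
size, and the rule of keeping only the top tenth on the predictor are
all assumptions chosen for illustration.

```python
import random

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5

def restriction_demo(n=20000, true_r=0.5, keep_top=0.10, seed=1978):
    """Return (r in the full group, r among the top `keep_top` on the predictor)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # predictor, e.g. an early rating
        e = rng.gauss(0.0, 1.0)  # independent noise
        y = true_r * x + (1 - true_r ** 2) ** 0.5 * e  # criterion
        pairs.append((x, y))
    full_r = pearson_r([x for x, _ in pairs], [y for _, y in pairs])
    # Pre-selection: keep only those highest on the predictor.
    pairs.sort(key=lambda p: p[0])
    selected = pairs[int(n * (1 - keep_top)):]
    restricted_r = pearson_r([x for x, _ in selected],
                             [y for _, y in selected])
    return full_r, restricted_r

full_r, restricted_r = restriction_demo()
print(f"full group r = {full_r:.2f}, selected group r = {restricted_r:.2f}")
```

The correlation among the selected cases is markedly lower than in the
full group even though the underlying relationship is unchanged, which
is the pattern the paragraph above describes.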


Given that these people are fairly homogeneous, it may well be that
one graduate is as intellectually capable of teaching as another
graduate. If this is the case, then we could select teachers by lottery
and do as well as we are doing now.


Before anyone gets too angry about this suggestion, let us clarify
that we are not posing it as a real alternative. What we are doing
instead is suggesting the bankruptcy of the classical norm-referenced
predictive approach for purposes of teacher selection and for purposes
of identifying the successful teacher. What is required now are
value-oriented studies: studies where a group of educators begin to
decide what constitutes effective teaching. The empirical approach
cannot really establish what constitutes effective performance unless
some common definition of "effective" is operative.


Even this approach is too simple. The complete issue is what
constitutes effective teaching under what circumstances. An effective
science teacher may differ markedly from an effective English teacher.
An effective English teacher in one neighbourhood may be ineffective
in a neighbourhood of different social class or ethnic background.
Alternatively, it may be that most teachers are effective. The task
then is to identify those that are ineffective (under what
circumstances) and to establish minimum competency standards.


The task then is not an easy one. Rather than predict, we must 
systematically provide the conceptualization to formulate the characteris- 
tics of a "good" teacher and define our educational needs. What should 
a teacher be able to do and how can we know if they are able to do it? 

We offer no easy answers, only a help in focus. An optimal situation 
would be to determine what a starting teacher must be able to do at time 
of hiring. Future needs cannot be easily anticipated without gross 
errors. Also, if current needs can be satisfied by existent skills, then 
other skills can be developed on a "need-to-know" basis. 


Without solutions to current educational staffing problems, we may
expect things to continue as they are. Jobs will be retained on the
basis of seniority, and little new blood will enter the system at all.
We will continue avoiding the issue of effectiveness, unless it implies
that everyone in the system can retain their jobs. The consequences
of this policy are unknown. The average age of teachers will continue
to rise and it will increasingly differ from that of students. If
identification with the teacher or modelling of teacher behaviour is an
important force in education, then the quality of education will
gradually decline. If this is not the case, what is? How will new
ideas work their way into the school system? How will a group
accustomed to teaching by 1978 methods respond to the educational
challenges of the 80's and 90's? How will the scarcity of educational
jobs affect the quality of new teachers?


We raise this host of questions after considering the findings 
of the Queen's study. Only one thing is apparent. The answer to these 
questions will have a major impact on our society. It is up to us 
either to create our future through affirmative action or to find our- 
selves the victims of forces we don't yet understand. The decision 
depends upon the educators of this province. 




REFERENCES 


Bolton, D. L. Instructor's Guide for Use of Simulation Materials for
Teacher Selection. Columbus: University Council for Educational
Administration, 1970.

Bolton, D. L. Selection and Evaluation of Teachers. Berkeley:
McCutchan, 1973.

Brown, I., Weinstein, E. L., & Wahlstrom, M. W. Admission and Selection
Procedures for Nursing: Literature Review and Annotated
Bibliography. Toronto: The Ontario Institute for Studies in
Education, 1978, 118 pp.

Campbell, D. T., & Fiske, D. W. Convergent and divergent validation by 
the multitrait-multimethod matrix. Psychological Bulletin, 1959, 
56, 81-105. 

Chauncey, H., & Frederiksen, N. The functions of measurement in educational
placement. In E. F. Lindquist (Ed.) Educational Measurement.
Washington, D. C.: American Council on Education, 1951, pp. 85-116.

Cronbach, L. J. Test Validation. In R. L. Thorndike (Ed.) Educational 
Measurement (2nd ed.). Washington, D. C.: American Council on 
Education, 1971, pp. 443-507. 

Cronbach, L. J., & Gleser, G. Psychological tests and personnel decisions
(2nd ed.). Urbana: University of Illinois Press, 1965.

Davis, J. A. Faculty perceptions of students:I. The development of the 
Student Rating Form. Educational Testing Service Research 
Bulletin, 1964, No.10. 

Davis, J. A. Faculty perceptions of students:VI. Characteristics of 
students for whom there is faculty agreement on desirability. 
Educational Testing Service Research Bulletin, 1966, No.28. 


Fishman, J. A. Unsolved Criterion Problems in the Selection of 
College Students. Harvard Educational Review, 1958, 28, 340-349. 

Gough, H. G., Hall, W. B., & Harris, R. E. Admissions procedures as
forecasters of performance in medical training. Journal of
Medical Education, 1963, 38, 983-998.

Guilford, J. P. Psychometric methods (2nd ed.). New York: McGraw-Hill,
1954.


Hills, J. R. Use of measurement in selection and placement. In 
R. L. Thorndike (Ed.) Educational Measurement (2nd ed). 
Washington, D. C.: American Council on Education, 1971, 
pp. 680-732. 

Hills, J. R., Gladney, & Klock, J. A. Setting cutoff scores in
selective admissions. Journal of Educational Measurement,
1967, [volume and pages illegible].

Hilton, T. L., & Myers, A. E. Growth study II. Personal background, 
experience, and school achievement: An investigation of the 
contribution of questionnaire data to academic prediction. 
Journal of Educational Measurement, 1967, 4, 69-80.

Horst, P. A technique for the development of a differential prediction 
battery. Psychological Monographs, 1954, 68 (Whole No. 380). 

Horst, P. A technique for the development of a multiple absolute
prediction battery. Psychological Monographs, 1955, 69
(Whole No. 390).

Horst, P. Psychological measurement and prediction. Belmont, Calif.: 
Wadsworth, 1966. 

Kelly, E. L. Theory and techniques of assessment. Annual Review of 
Psychology, 1954, 5, 286-290. 

Klein, S. P., & Hart, F. M. The nature of essay grades in law school. 
Educational Testing Service Research Bulletin, 1968, No. 6. 

Lavin, D. E. The prediction of academic performance. New York: Russell 
Sage Foundation, 1965. 

Lindquist, E. F. An evaluation of a technique for scaling high school 
grades to improve prediction of college success. Educational 
and Psychological Measurement, 1963, 23, 623-646. 

Linn, R. L. Grade adjustments for prediction of academic performance: 
A review. Journal of Educational Measurement, 1966, 3, 313-329. 

Lunneborg, P. W., & Lunneborg, C. E. The differential prediction of
college grades from biographical information. Educational and
Psychological Measurement, 1966, 26, 917-925.

Ryans, D. G. Characteristics of Teachers: Their Description, Comparison,
and Appraisal. Washington, D. C.: American Council on
Education, 1960.


Sawyer, J. Measurement and prediction: clinical and statistical.
Psychological Review, 1966, 66, 178-200.


Sechrest, L. Incremental validity: A recommendation. Educational and 
Psychological Measurement, 1963, 23, 153-158. 

Smith, M. B. Explorations in competence: A study of Peace Corps
teachers in Ghana. American Psychologist, 1966, 21, 555-566.

Weinstein, E. L., Brown, I., & Wahlstrom, M. W. A Review of Admission
and Selection Procedures for Diploma Nursing Programs in Colleges
of Applied Arts and Technology: Program Survey and Recommendations.
Contract Research report submitted to the Ontario Ministry of
Colleges and Universities, 1976, 125 pp.




APPENDICES 


APPENDIX I 


IMPORTANT ATTRIBUTES 


[Chart: RANK of the attributes ORGANIZATION, SOCIABILITY, ORIGINALITY,
EMPATHY and BUOYANCY. The chart itself is illegible.]


NAME ____________        DATE ____________

A. PLANNING AND ORGANIZING      D. INSTRUCTION                  G. OUT-OF-CLASSROOM
   CLASSWORK                                                       PROFESSIONAL ACTIVITIES
purposeful : aimless            alert : apathetic               active : evading
systematic : disorganized       resourceful : inflexible        responsible : irresponsible
cooperative : antagonistic      poised : agitable               skillful : unskillful
original : unimaginative        helpful : hindering             accurate : inexact
                                inspirational : uninspiring     punctual : tardy
B. CLASSROOM MANAGEMENT         precise comm. : fuzzy           constructive : antagonistic
punctual : tardy                pleasant : harsh                progressive : stagnant
controlled : disorderly
consistent : inconsistent       E. EVALUATION                   H. RELATIONS WITH STAFF
                                                                   AND PARENTS
flexible : fixed                continuous : erratic            approachable : aloof
fair : partial                  rational : irrational           cooperative : uncooperative
responsive : indifferent        systematic : disorganized       discreet : imprudent
just : inequitable                                              responsible : irresponsible
C. CREATING A MOTIVATIONAL                                      open-minded : narrow-minded
   ENVIRONMENT                  F. GUIDANCE AND COUNSELING
steady : spasmodic              effective : ineffective
flexible : rigid                resourceful : trite             J. SCHOOL-COMMUNITY
broad-minded : narrow-minded    [illegible]                        RELATIONS
sensitive : unfeeling           approachable : [illegible]      active : inactive
kindly : critical               [illegible]                     initiates : follows
sense of humor : humorless      [illegible]                     effective : ineffective
stimulating : dull


APPENDIX 2


INTERVIEW ASSESSMENT


APPLICANT     Student Number ________     Initials ________


DATE          Day ____  Mo. ____  Year ____


INTERVIEWER   No. (36) ____   Status (39) ____   (Code: T - Teacher, A - Administrator, F - Faculty, S - Student)


Signature ________________


PERSONAL DIMENSIONS 


Indicate the rating for each dimension by marking an ‘X' in the appropriate box on each line. 


Partial 40 Fair 
Autocratic 41 Democratic
Aloof 42 Responsive 
Restricted 43 Understanding 
Harsh 44 Kindly 

Dull 45 Stimulating 
Stereotyped 46 Original 
Apathetic 47 Alert 
Unimpressive 48 Attractive 
Monotonous 49 Pleasant (voice) 
Inarticulate 50 Articulate
Evading 51 Responsible
Erratic 52 Steady
Excitable 53 Poised 
Uncertain 54 Confident 
Disorganized 55 Systematic 
Inflexible 56 Adaptable 
Pessimistic 57 Optimistic 
Immature 58 Integrated 
Narrow 59 Broad 


GLOBAL ATTRIBUTES 


EMPATHY
                                       1   2   3   4   5   6   7
Aloof, Egocentric, Restricted.    60  [ ] [ ] [ ] [ ] [ ] [ ] [ ]  Kindly, Understanding, Friendly.


ORGANIZATION
                                       1   2   3   4   5   6   7
Evading, Unplanned, Slipshod.     61  [ ] [ ] [ ] [ ] [ ] [ ] [ ]  Responsible, Systematic, Businesslike.


VITALITY
                                       1   2   3   4   5   6   7
Dull, Routine, Lacklustre.        62  [ ] [ ] [ ] [ ] [ ] [ ] [ ]  Stimulating, Imaginative, Surgent.


OVERALL FINAL ASSESSMENT
                                       1   2   3   4   5   6   7
Unsuitable                        63  [ ] [ ] [ ] [ ] [ ] [ ] [ ]  Outstanding


SUPPLEMENTARY INFORMATION 


1. ORAL ENGLISH


Please rate the candidate's proficiency in speaking the English language.


                         1   2   3   4   5   6   7
Low proficiency     64  [ ] [ ] [ ] [ ] [ ] [ ] [ ]  High proficiency


If the candidate scored less than 4, describe in a few words the nature of the shortcomings.


2. COMPETENCE IN OTHER LANGUAGES.


Please ascertain the candidate's competence in other languages, indicating whether it is
in the spoken or written form (or both) and estimating the level of proficiency.


3. OTHER SUPPORTING TALENTS.


Indicate whether the candidate has skills or talents that might be useful for a teacher,
e.g. music, art, sports, etc.


4. OTHER COMMENTS



APPENDIX 3


McARTHUR COLLEGE


PROCEDURES FOR INTERVIEWING


A. GENERAL GUIDE-LINES 
Each interviewing panel will be headed up by a McArthur College
faculty member who will act as chairman, and in that capacity
serve as the university's official representative in all
matters pertaining to the work of the interviewing team.


In keeping with the College's commitment to humane processes,
interviewers will seek to make the applicant as comfortable
as circumstances allow and endeavour to minimize the level of
anxiety and stress during the interview.


Since a number of applicants may fail to earn admission to
McArthur as a result of their interview ratings, it is vital
that all interviewees be accorded every courtesy and
consideration in order for them to make as favourable an impression
as possible and to leave with the feeling that they were given
a sympathetic hearing.

One specific illustration of the foregoing principle
is to provide the applicant with an opportunity, toward the
conclusion of the interview, to relate anything significant
about himself that may not have been touched on during the
discussion up to that point.


Although McArthur representatives will attempt to respond to
any questions asked by the applicant, they will bear in mind
that "the individual who does the talking is actually the
person being interviewed". Therefore, they will try to
structure the discussion so that the applicant, in fact,
does most of the speaking during the interview.


Interviews will normally last 20 minutes, allowing another
10 minutes for the necessary evaluative and administrative
activities.


Interviewing teams are to proceed on the assumption that all
interviewees have the necessary academic qualifications for
admission. The interview need not, indeed should not, be
focussed on evaluating the candidate's subject area competence.


Only the chairman of the interviewing panel will have access
to candidates' application forms, but he is encouraged to
share with his fellow-interviewers relevant information about
the candidates that may be helpful in the conduct of the
interviews. For instance, a reference on the Personal Data Sheet
to summer work with youngsters may serve as a point of entry
into a discussion about the candidate's leadership experiences.


In the case of some applicants, it will assist decision-making
by the Admissions Committee if there is an indication of a
third Curriculum choice in the event that their second option
choices are closed. Chairmen are asked to offer an opportunity
to every candidate to identify such a third option. This
choice ought then to be recorded on the application form.


Certain precautions are to be observed in the use of the
Interview Assessment sheets:

(1) Candidates are not to be shown the Interview
Assessment sheets;

(2) Although interviewers should feel free to jot
down notes during the interview, they will
complete the summary form itself only after
the candidate's departure.

(3) It is critically important that interviewers
arrive at their own judgments independently
of each other; therefore, interview summaries
ought to be completed prior to any comparison
of impressions.


B. INSTRUCTIONS Re INTERVIEW ASSESSMENT FORM


Interviewers are asked to make certain that they provide all
the data requested at the top of the Assessment form, regarding
the applicant and the interviewer. The key-punch operators
will require this information to do a satisfactory job in the
preparation of computer cards.


The thirty bipolar items under Personal Dimensions relate to
the six important personal/professional attributes (listed
in the following section on the form). The five blocks of
six items each were set up for purposes of readability alone
and are not organized on any topical basis.


Since individuals' future careers may depend on the
interviewers' objectivity and thoroughness, the following
guide-lines ought to be observed:

(1) Rate each dimension independently. Avoid a
syndrome approach which results in pattern
responses. (Particularly when fatigued,
resist the tendency toward uniform rating
on all dimensions.)

(2) Use the full scale. All seven intervals in the
continuum are to be considered. The end-points
are to be utilized, and the mid-point is not to
be over-utilized.

(3) Our advice from measurement experts is that
interviewers should be urged to attempt to complete
every item if at all possible. It is crucial to
have the data quite complete.


The foregoing instructions have been intentionally 
expressed in a very positive, pointed way so as to emphasize 
the need for interviewers to be discriminating and decisive. 

But a word of reassurance may now be in order. The 
interview assessment instrument deals with approximations 
and does not call for agonizing appraisal by the rater. 
Remember that your individual rating of any particular 
dimension constitutes a relatively small element in the 
total admissions procedure. So do your careful best, and
then relax.
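The three guide-lines above (independent ratings, full use of the seven-interval scale, and complete data) lend themselves to a simple screening of each completed form before key-punching. What follows is only a minimal sketch in Python under stated assumptions: the function name `check_ratings`, the use of `None` for a blank item, and the 50 per cent limit on mid-point ratings are all hypothetical and are not part of the McArthur procedure.

```python
from collections import Counter

SCALE_MIN, SCALE_MAX = 1, 7   # seven intervals, as stated in guide-line (2)
N_ITEMS = 30                  # thirty bipolar items under Personal Dimensions
MIDPOINT = 4                  # middle interval of a 1-7 scale


def check_ratings(ratings, midpoint_share_limit=0.5):
    """Return a list of warnings for one completed assessment form.

    `ratings` is a list with one entry per item: an int from 1 to 7,
    or None where the item was left blank. The 0.5 limit on mid-point
    ratings is an illustrative assumption, not a documented threshold.
    """
    warnings = []
    if len(ratings) != N_ITEMS:
        warnings.append(f"expected {N_ITEMS} items, got {len(ratings)}")

    # Guide-line (3): every item should be completed if at all possible.
    missing = [i + 1 for i, r in enumerate(ratings) if r is None]
    if missing:
        warnings.append(f"items left blank: {missing}")

    given = [r for r in ratings if r is not None]
    out_of_range = [r for r in given if not SCALE_MIN <= r <= SCALE_MAX]
    if out_of_range:
        warnings.append(f"ratings outside 1-7: {out_of_range}")

    # Guide-line (1): a single repeated value suggests a pattern response.
    if given and len(set(given)) == 1:
        warnings.append("uniform rating on all dimensions")

    # Guide-line (2): the mid-point is not to be over-utilized.
    if given and Counter(given)[MIDPOINT] / len(given) > midpoint_share_limit:
        warnings.append("mid-point over-utilized")

    return warnings
```

A form rated 4 on every item would draw both the uniform-rating and the mid-point warnings, while a form that uses the whole scale and leaves nothing blank would draw none. Such a check can only flag a form for a second look; it cannot judge whether the ratings themselves are sound.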


The section titled Order of Important Attributes asks
interviewers to rank-order the six major factors, from
1 to 6 with no ties. Since this section was designed to
provide internal validation in subsequent research,
interviewers can assume that their responses here will not affect
decision-making about admission.
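The "1 to 6 with no ties" rule can be stated precisely: each of the six attributes receives exactly one rank, and the ranks used are the numbers 1 through 6, each once. The short Python sketch below illustrates that rule only. The function name `is_valid_ranking` and the dictionary layout are hypothetical, and the attribute names are taken from the list given in Appendix 4 on the assumption that the same six apply here.

```python
# Six attributes as named in Appendix 4 (assumed to match this form).
ATTRIBUTES = [
    "Empathy",
    "Organization",
    "Leadership",
    "Buoyancy",
    "Originality",
    "Professional Impression",
]


def is_valid_ranking(ranking):
    """Return True if `ranking` gives each attribute a distinct rank 1..6.

    `ranking` is a dict mapping attribute name to an integer rank.
    """
    # Every attribute must be ranked, and nothing else may appear.
    if set(ranking) != set(ATTRIBUTES):
        return False
    ranks = list(ranking.values())
    if not all(isinstance(r, int) for r in ranks):
        return False
    # No ties: the ranks must be exactly 1, 2, 3, 4, 5, 6 in some order.
    return sorted(ranks) == list(range(1, len(ATTRIBUTES) + 1))
```

For instance, a ranking that assigns 1 to both Empathy and Leadership would be rejected as a tie, as would one that omits an attribute altogether.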


The Comments portion is optional and provides an opportunity
for the inclusion of important information not covered in
previous sections or in the application form.


C. SUGGESTIONS Re INTERVIEWING STRATEGY 


Many of the specific qualities included in the interview 
summary form will be manifested rather naturally during the 
conversation and will require no special strategy. Qualities 
of that sort include expressiveness, sense of humour, and so
forth.

But other characteristics may have to be ascertained
by inference as the result of rather careful questioning.
Attributes such as empathy, leadership, originality and
organizational ability will require that kind of skilful,
indirect approach.

To assist interviewing panels in the conduct of their
discussions it is suggested that most interviews will need to
honour certain natural stages. Although specific questions
and topics of discussion will vary from case to case, the
general design of most interviews will incorporate these
features.


STAGE ONE - REDUCTION OF TENSION 


(1) Introductions and explanations 
(2) Casual conversation 
(3) Identification of a third curriculum option 


STAGE TWO - LEADERSHIP QUALITIES

(1) Invite the candidate to describe some leadership
experiences that he has had.

(2) Then ask the candidate to recall some problems or
difficulties he encountered.

e.g. What was the hardest thing you had to handle as a
member of the Students' Council?

(3) Finally, to assess qualities such as decisiveness and
assumption of responsibility, request the candidate
to tell how he coped with the situation.

e.g. Well, as camp counsellor what did you do in
that emergency?


STAGE THREE - ORIGINALITY 

In order to gauge the candidate's capacity for flexible, 
imaginative thinking, it is important to avoid topics where 
simple recall or resort to conventional wisdom will suffice.
There emerges a need for the novel and the unanticipated, 
such as hypothetical situations. 

e.g. How would you respond if your students ask to
participate in the planning of their courses?

e.g. Suppose you're teaching Grades 7 and 8, and the
kids don't like that particular unit.

e.g. (For a non-Math candidate) How do you think
Mathematics ought to be taught in High School?

e.g. Describe how you would like to organize your
own classroom's physical arrangements.

e.g. What alternatives are there to traditional
examinations and grading?

e.g. Suppose the Department Head has imposed a course
outline, and it won't go with the kids in your class.

e.g. What steps would you take if you discover that you
have an inadequate background in one of your
teaching fields?

e.g. There are two students talking at the back of
the class. What alternative courses of action
are open to you as their teacher?

e.g. Suppose the students in your school don't seem
to care about passing....


STAGE FOUR - EMPATHY 


Here again the interviewers may aim at illustrations 
from previous experience or responses to hypothetical problems. 

Illustrations of the former:

e.g. How did you get along with your room-mates at
the co-op?


e.g. How did you resolve the interpersonal difficulty
with your supervisor at work?


Examples of the latter:

e.g. Are students more difficult to discipline to-day
than they used to be?

e.g. Suppose you have a "multi-problem" kid in your
class (inattentive, sleepy, failing, needs glasses,
works nights in the bowling-alley, father is
crippled, a large family, on welfare), how would
you as the teacher respond?

e.g. Suppose a kid in your class gets beaten up on
the way home from school by some of his class-mates.

e.g. You suspect some students in your class are on
drugs....


STAGE FIVE - ORGANIZATIONAL ABILITY 


e.g. How did you make your career-choice? 
e.g. How did you happen to choose your university? 


e.g. What are your reasons for attending this
McArthur interview?

e.g. What are some of the necessary stages in the
organizing of a field trip?

e.g. What are the principal steps in applying for a
teaching position?


STAGE SIX - CONCLUSION


(1) Is there anything else that you would like
to say about yourself that we haven't touched on?
(2) Clarification of next steps in admissions procedures
(3) Good-byes


It will be understood that the stages identified 
above will not necessarily follow each other in the same 
consecutive order as shown here. 

A final word of advice: it would make good sense
for the three members of the panel to spend about an hour
together going over these guide-lines and agreeing on a
mode of operation.




APPENDIX 4 


INFORMATION Re TEACHING ASSESSMENT FORM


BACKGROUND


This is a word of explanation about the long yellow 
sheets. Briefly, we are conducting a follow-up study on this 
year's candidates to see how well our selection procedures 
predicted their success as teachers. All of our associate 
teachers and faculty supervisors are being asked to complete 
one form for each candidate whose teaching performance they 
observe in the schools. 


Candidates are not to be shown your assessment of their 
performance on the Teaching Assessment form. Your ratings on 
this instrument will have no bearing whatever on candidates' 
academic grades. 


GUIDE-LINES 
1. Simply ignore Student Number and Supervisor No. 


2. The thirty bipolar items under Personal Dimensions relate 
to six important attributes: Empathy, Organization, 
Leadership, Buoyancy, Originality, and Professional 
Impression. The five blocks of six items each were set 
up for purposes of readability alone and are not organized 
on any topical basis. 


3. Rate each dimension independently. Avoid a syndrome approach 
which results in pattern responses. 


4. Use the full scale. All seven intervals in the continuum 
are to be considered. The end-points are to be utilized, 
and the mid-point (the average) is not to be over-utilized. 


5. Our advice from measurement experts is that interviewers
should be urged to attempt to complete every item if at
all possible. It is crucial to have the data quite complete.


6. The section titled Order of Important Attributes asks inter- 
viewers to rank-order the six major factors, from 1 to 6 
with no ties. Frequently, this is a difficult section to 
complete. (It was designed to provide internal validation
in subsequent research.)


7. The Comments portion will ordinarily be left blank.


8. By way of reassurance, the assessment instrument deals
with approximations and does not call for agonizing
appraisal by the rater. So do your careful best, and
then relax.


Associates may enclose completed Teaching Assessment 
forms along with the regular Associate Teacher's Report in the 
self-addressed envelopes from Professor Hennessy's office. 


Please be assured that we do appreciate your assistance 
in this matter. 


W.S. Peruniak
Assistant Dean 


